(54) World wide web news retrieval system

(57) A World Wide Web site data retrieval system includes an input device for inputting data and
commands to access the World Wide Web, and a memory for storing a Web site data retrieval driver
which includes a Web reader, stored Web site address information, stored Web site commands, and
stored format information. The memory also stores process steps to connect to a Web site and to
issue commands within the connected Web site, and a connection to the World Wide Web. The system
includes a processor for launching the Web site data retrieval driver in response to a command to
access the World Wide Web. The Web site retrieval driver, upon being launched, (1) launches the
Web reader to connect to the World Wide Web via the connection, (2) retrieves the Web site address
information and Web site commands, (3) instructs the Web reader to access the Web site based on
the Web site address information and Web site commands, (4) downloads Web site data from the Web
site based on the Web site commands, (5) stores the Web site data in a linear document, (6) repeats
steps 1 through 5 until all addresses in the stored Web site address information have been
accessed, and (7) formats the linear document into a personalized document based on the format
information.
Description

BACKGROUND OF THE INVENTION 
Field Of The Invention

The invention relates to a data retrieval system which automatically traverses hypermedia documents
on a computer network and automatically retrieves information from those documents based on a match
between the structure of the documents and a personalized data retrieval structure. More
particularly, the invention can retrieve articles from a news service, from a magazine service, or
from a combination of both services which are located on the World Wide Web, a private computer
network that supports hypermedia links, or any other hypermedia-linked computer system.

For example, there exists a Web site for retrieving news articles from the New York Times and a Web
site for retrieving articles from People magazine. The retrieval system of the invention can
traverse through such Web sites and select articles based on a personalized data retrieval
structure. The personalized data retrieval structure may include commands to retrieve a full text
of the front page only, headlines of the business section, headlines of the stock section and
sports section, etc. In addition, the personalized data retrieval structure may include
content-based rules to retrieve articles with certain keywords, to exclude articles with certain
keywords, or to include articles based on a rule-based content analysis. The invention also
provides a method for synthesizing all retrieved news articles and printing the synthesized news
articles into a newspaper-type format in which each of the articles is arranged based on a user's
predefined layout.

While the above example is in the context of the Web, hypermedia documents can reside on other
types of networks besides the Web, such as an intranet. An intranet is a private computer network
that is not connected to outside computer networks. For example, a company's own computer network
could be an intranet with hypermedia documents on it. For brevity, the following discussion is made
with respect to the World Wide Web. However, it should be understood that the invention applies
equally well to any type of computer network that contains hypermedia documents, such as an
intranet, different hypermedia-linked computer networks that reside on the Internet other than the
Web, etc.

A hypermedia document on the Web can span multiple Web sites. Such documents can be newspapers,
news articles, magazines, catalogs, manuals, memoranda, and the like. For brevity, the following
discussion is made with respect to sources of news information. However, it should be understood
that the invention applies equally well to any other type of hypermedia document.

Description Of The Related Art

The World Wide Web is an on-line source of hypermedia documents containing hypermedia text and
images that act as links to other documents, Web sites, etc. As a result, documents on the Web are
not organized sequentially. Rather, a user is automatically linked to other documents, or Web
sites, to complete the viewing of a document by selecting a hypermedia link, such as a text link or
an image link, in the document. Accordingly, an entire document cannot be viewed by scrolling
through text.

One popular use of the Web is on-line publication and distribution of magazines and newspapers.
Currently, many Web news services, such as the New York Times, allow the user to define keywords of
interest and to receive news information, daily or hourly, that contains text matching the
keywords. The news information can then be delivered to the user's computer via modem or E-mail.
However, most Web news site newspapers, like the New York Times, include too much information, most
of which has no interest to the user since the information is retrieved based only on a keyword
match.

Other sources of news information are provided through information suppliers like "Individual
Inc." Individual Inc. supplies users with a brief summary of the top twenty most relevant articles
based on a user's predefined keywords. This subscription news service allows the user to specify
five to ten keywords, which are then prioritized by the user. The information service searches the
Web for magazines and newspapers which contain any of the keywords. Based on the keyword searches,
twenty of the most relevant articles are selected, compiled into a brief one-page summary, and
transmitted to the user via facsimile for the user's review. However, in order to review an entire
document rather than the summary, the user must log onto a specific Web site containing the
document in order to retrieve and review the document.

There are yet other services which permit the user to personalize a newspaper to be displayed at
the user's terminal by storing links to various news articles from various news sources on the Web.
For example, CRAYON "Create Your Own Newspaper" permits a user to select specific sections from
among links to over twenty-five different on-line newspapers, and to compose the selections into a
personalized newspaper. Using CRAYON it is possible to compose a personalized newspaper containing,
for example, links to the international section of the New York Times, the business section of the
Wall Street Journal, and the sports section of the Chicago Tribune. The HTML (hypertext markup
language) source file for this newspaper is then stored to mass media storage for later use.

While the foregoing news and information services provide convenient ways to keep updated on the
news, they do not allow a user to access and view the news in the way that people naturally read a
real-world newspaper. Namely, people naturally read a newspaper by scanning the pages of sections
that they find interesting and then reading those articles that grab their attention. In other
words, people use a structural approach to decide what pages to look at initially (e.g., the first
page of the Business and World sections, and the comics page of the Arts section). They then scan
the selected pages for articles.

In sum, conventional news and information services do not allow a user to access data from a
hypermedia document on the basis of the structure of the document and then to format that data in a
manner that allows the user to scan and read the data in a natural fashion.

SUMMARY OF THE INVENTION 

In accordance with one aspect, the invention addresses the above deficiencies in the art by
accessing at least one hypermedia document, retrieving data from the hypermedia document into an
extracted data tree, with the data retrieved based on a structure of the hypermedia document,
flattening the extracted data tree into a linear document, and formatting the linear document into
a formatted document.

In another aspect, the invention creates a personal-news-profile for retrieving data from a
hypermedia-linked computer network. The hypermedia-linked computer network is accessed, a learning
process is started, the hypermedia-linked computer network is traversed with commands, at least one
rule is extracted from the commands, and the rule(s) is compiled into the personal-news-profile.

In yet another aspect, the invention creates a personalization profile for a Web site data
retrieval system. Data and commands are input to access the World Wide Web, and a connection is
made to the World Wide Web. A Web reader is launched, and the Web reader accesses the Web via the
connection. In response to user commands, a learning mode is entered. Commands are sent to traverse
the World Wide Web, and at least one rule is extracted from the commands. The rule(s) is compiled
into a personalization profile, which is stored.

In yet another aspect, the invention retrieves articles from a hypermedia-linked computer network
and formats the articles into a personalized newspaper. A stored personal-news-profile is
retrieved. The personal-news-profile includes address data for a site on the hypermedia-linked
computer network, command data for accessing data from the site, and newspaper layout commands. The
site is accessed based on address data stored in the personal-news-profile, and articles at the
site are downloaded based on command data stored in the personal-news-profile. The downloaded
articles are flattened into a linear document, and the linear document is formatted into the
personalized newspaper according to newspaper layout commands stored in the personal-news-profile.

In yet another aspect, the invention retrieves data from a World Wide Web site and formats the data
into a personalized document. A Web site data retrieval driver which includes a Web reader, stored
Web site address information, stored Web site commands, and stored format information is accessed.
The invention (1) launches the Web reader to connect to the World Wide Web via a connection to the
Web, (2) retrieves the Web site address information and Web site commands, (3) instructs the Web
reader to access the Web site based on the Web site address information and Web site commands,
(4) downloads Web site data from the Web site based on the Web site commands, wherein the data is
downloaded with reference to a linked list so as to avoid hypermedia links that form loops and so
as to avoid repetitious downloading of data that has already been downloaded, (5) stores the Web
site data in a linear document, (6) repeats steps 2 through 5 until all addresses in the stored Web
site address information have been accessed, and (7) formats the linear document into the
personalized document based on the format information.
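Steps (2) through (6) above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration only: the in-memory FAKE_WEB table stands in for the Web, fetch stands in for the Web reader, a Python list stands in for the linked list of addresses already downloaded, and the depth limit is an arbitrary stand-in for the stored Web site commands. It is not the driver described in this specification.

```python
from collections import deque

# Hypothetical in-memory "Web": address -> (page text, outgoing links).
FAKE_WEB = {
    "home": ("Front page", ["index1", "index2"]),
    "index1": ("Business", ["articleA", "home"]),    # link back to home forms a loop
    "index2": ("Sports", ["articleA", "articleB"]),  # articleA is linked twice
    "articleA": ("Article A text", []),
    "articleB": ("Article B text", ["home"]),
}


def fetch(address):
    """Stand-in for the Web reader: return (text, links) for an address."""
    return FAKE_WEB[address]


def retrieve(start_addresses, max_depth=2):
    """Download pages breadth-first, recording each address only once."""
    visited = []           # record of addresses already downloaded
    linear_document = []   # downloaded data, kept in download order
    for start in start_addresses:
        queue = deque([(start, 0)])
        while queue:
            address, depth = queue.popleft()
            if address in visited:
                continue   # skip loops and repeated downloads
            visited.append(address)
            text, links = fetch(address)
            linear_document.append(text)
            if depth < max_depth:
                queue.extend((link, depth + 1) for link in links)
    return linear_document


pages = retrieve(["home"])
```

Checking each address against the record before downloading is what keeps the link from "index1" back to "home" from forming an endless loop and keeps "articleA" from being downloaded twice.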

In yet another aspect, the invention accesses and retrieves data from World Wide Web sites and
formats the data into a personalized document. The invention connects to the World Wide Web,
retrieves user defined Web site address information, user defined Web site commands, and user
defined formatting commands, and activates a Web reader so as to access a Web site based on the
user defined Web site address information. The Web reader is used to download data from the Web
based on the user defined Web site commands, and the data is downloaded into an extracted data
tree. The downloading continues until all addresses in the user defined Web site address
information have been accessed. The extracted data tree is flattened into a linear document, and
the flattened document is formatted into the personalized document based on the user defined
formatting commands.

In yet another aspect, the invention retrieves news articles from on-line news services on the
World Wide Web and formats the news articles into a personalized newspaper. The invention stores a
personal-news-profile which comprises address data and command data for accessing data from a Web
site and newspaper format commands, retrieves the stored personal-news-profile and accesses the
data stored therein, activates a Web reader to contact a Web site based on address data stored in
the personal-news-profile, downloads news articles at the contacted Web site based on command data
stored in the personal-news-profile, stores the downloaded news articles, and formats the stored
news articles into the personalized newspaper based on newspaper format commands stored in the
personal-news-profile.
In yet another aspect, the invention formats a hypermedia document into a personalized document. A
location of the hypermedia document is specified, a type of the hypermedia document is specified, a
scope of data to be retrieved from the hypermedia document is specified, wherein the scope is based
on a structure of the hypermedia document, and a format is specified for formatting the data
retrieved from the hypermedia document into the personalized document. The hypermedia document
found at the specified location is accessed, data is retrieved from the hypermedia document in
accordance with the specified hypermedia document type and in accordance with the specified scope,
and the data is formatted into the personalized document in accordance with the specified format.

In yet another aspect, the invention is a system for processing a hypermedia document. The system
accesses the hypermedia document, extracts addresses from the hypermedia document, and stores the
addresses extracted from the hypermedia document in a container. The system activates a processing
function to process data stored at the addresses stored in the container, downloads the data stored
at the addresses stored in the container into a memory, and extracts predetermined data from
downloaded data in accordance with predetermined configuration information. The predetermined data
is then formatted in accordance with predefined formatting settings to generate a formatted
document, and the formatted document is processed in accordance with the processing function.

In preferred embodiments, the system inputs the formatting settings and configuration information
via a graphical user interface. The graphical user interface comprises plural processing icons, one
of which activates the processing function. By virtue of the graphical user interface, a user can
interactively set a format [illegible] and change that format should a change be desired.

In particularly preferred embodiments, the graphical user interface is displayed in plural modes.
The plural modes comprise (1) a fully-functional mode in which the graphical user interface
displays formatting fields, processing options, menus and the processing icons, and (2) a
minimizing mode in which the graphical user interface displays only the processing icons.
Typically, the graphical user interface displayed in the minimizing mode is displayed during
browsing of the hypermedia document. By displaying the graphical user interface in plural modes,
the present invention facilitates operation of the invention during browsing of the hypermedia
document.

This summary has been provided so that the nature of the invention may be understood quickly. A
more complete understanding of the invention can be obtained by reference to the following detailed
description of the preferred embodiments thereof in connection with the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a perspective view showing the outward appearance of the personal news retrieval system
according to an embodiment of the invention.

Figure 2 is a block diagram of the personal news retrieval system shown in Figure 1.

Figure 3, comprised of Figures 3A, 3B, 3C and 3D, shows representational diagrams illustrating an
example of the transformation of information from the Web (Figure 3A) to an extracted data tree
(Figure 3B), then to a flattened document (Figure 3C), and finally to a formatted document
(Figure 3D) according to an embodiment of the invention.

Figure 4 is a representational block diagram of the manner by which a personal-news-profile for
retrieving news articles via the Web is created or edited according to an embodiment of the
invention.

Figure 5, comprised of Figures 5A and 5B, shows flow diagrams describing how a
personal-news-profile is created or edited.

Figure 6 is a representational block diagram of the manner in which news articles are retrieved
from the Web and formatted with reference to a personal-news-profile according to an embodiment of
the invention.

Figure 7 is a flow diagram describing how news articles are retrieved from the Web with reference
to a personal-news-profile.

Figure 8 is a flow diagram showing how retrieved news articles are formatted with reference to a
personal-news-profile and sent to a print device interface.

Figures 9A to 9E depict a graphical user interface of a second embodiment of the present invention.

Figure 10 is a flow diagram describing operation of the second embodiment of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Figure 1 is a view showing the outward appearance of a representative embodiment of the invention.
Shown in Figure 1 is computing equipment 1, such as a Macintosh or an IBM PC or a PC-compatible
computer, having a windowing environment, such as Microsoft Windows. Provided with computing
equipment 1 is display monitor 2, such as a color monitor or a monochromatic monitor, keyboard 3
for entering data and user commands, and a pointing device such as mouse 4 for pointing and for
manipulating objects displayed on display 2. Computing equipment 1 also includes a mass storage
device such as disk drive 5. Data can be input into computing equipment 1 from a variety of sources
such as network interface 11a [illegible] or facsimile/modem interface 6. Network interface 11a is
used to connect computing equipment 1 to a local area network (LAN) or to a wide area network (WAN)
such as the World Wide Web.

Figure 2 is a detailed block diagram showing the internal construction of computing equipment 1. As
shown in Figure 2, computing equipment 1 includes central processing unit (CPU) 8 interfaced with
computer bus 9. Also interfaced with computer bus 9 is printer interface 10, fax/modem interface 6,
display interface 11, network interface 11a, keyboard interface 12, mouse interface 13, main
memory 14, and disk drive 5.

Main memory 14 interfaces with computer bus 9 so as to provide random access memory storage for use
by CPU 8 when executing an application such as personal-news-profile editor 16 or Web printer 17.
More specifically, CPU 8 loads these software applications from disk drive 5 into main memory 14
and executes the software applications out of main memory 14. In accordance with user instructions,
stored application programs are activated which permit processing and manipulation of data.
Typically, the software applications stored on disk drive 5, such as personal-news-profile
editor 16, Web printer 17, and HTML formatter 18, are stored on disk drive 5 by downloading the
software applications from a computer-readable medium such as a floppy disk or CD ROM or by
downloading the software applications from a computer bulletin board.

Disk drive 5 stores data files which can include text files and image files in compressed or
uncompressed format, and stores software application files [illegible]. The software application
files include Windows applications, DOS applications, and personal news retrieval files 15.
Personal news retrieval files 15 include personal-news-profile editor 16, Web printer 17, HTML
formatter 18, personal-news-profile(s) 19, and site profile(s) 20. The detailed functions of
personal news retrieval files 15 will be described below, after a brief overview of the operation
of the personal news retrieval system.

Overview of Personal News Retrieval

Figure 3, comprised of Figures 3A to 3D, illustrates the operation of a representative embodiment
of the invention. Figure 3A is a graphical representation of a typical Web site 21 with news
information contained therein. Within Web site 21 is homepage 22 with indices such as headings 23,
which are in turn linked to articles 24. Some of articles 24 are linked to other articles. As
article H 28 resides on another Web site, link 25 is a cross-site link. Link 25 illustrates how a
single hypermedia document, represented by homepage 22, can traverse multiple Web sites.

In order to retrieve news from Web site 21, the invention traverses Web site 21 to retrieve data
according to user-defined rules. As will be discussed in detail below, the rules can be based on
the structure of Web site 21, or on the structure of Web site 21 and its contents. The data is
retrieved into an extracted data tree, which preserves the organization of the data [illegible].

The organizaliort or ei.trac1ed daia vrea 27 has sevescL fea:'jres. t-lrst. extj«:;ted daiia tree 27 has root 28 which can 
have child nodes for ona or more sWm 29. ^yhich in turn ca 1 ha inoex nodes o } which oon&5,porvji to indices/headings 
3s 23. articles nodes 3*; , :Uii:l the H\kb. Secor d. exiracied data tre^^ 27 is a true tree, v/ith no loops (i.e.. cyclic paths) therein. 
For example, Figure 3A sinews a loop froiri hon-iepa^e 22 13 crxiaK iiccie #1, to ailicle C, and then back to homepage 22. 
This loop is removed when cr^Jc^Jng i^z^za trc * '\ 
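A minimal sketch of how a loop-free tree with root, site, index and article nodes might be built from a link graph follows. The Node class, the LINKS table and the kind_of helper are hypothetical names invented for this illustration, and the link graph is only loosely modelled on the loop described for Figure 3A; none of it is taken from the appendices.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str    # "root", "site", "index" or "article"
    label: str
    children: list = field(default_factory=list)


# Hypothetical link graph: address -> addresses it links to.
LINKS = {
    "homepage": ["heading 1", "heading 2"],
    "heading 1": ["article A", "article C"],
    "heading 2": ["article B"],
    "article C": ["homepage"],   # loop back to the homepage
}


def kind_of(label):
    """Classify an address as an index node or an article node."""
    return "index" if label.startswith("heading") else "article"


def build_tree(site_label, start):
    """Build root -> site -> index -> article nodes, dropping looping links."""
    root = Node("root", "extracted data")
    site = Node("site", site_label)
    root.children.append(site)
    seen = {start}

    def expand(parent, address):
        for target in LINKS.get(address, []):
            if target in seen:
                continue         # dropping this link removes the loop
            seen.add(target)
            child = Node(kind_of(target), target)
            parent.children.append(child)
            expand(child, target)

    expand(site, start)
    return root


tree = build_tree("Web site 21", "homepage")
```

Because every address is attached to the tree at most once, the link from article C back to the homepage simply never becomes a child node, so the result has no cyclic paths.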

Third, the organization of extracted data tree 27 is based on how the Web sites are traversed, and
not on the Web sites' actual layouts. Thus, article H 28 [illegible], indicating that the news
retrieval system accessed article H 28 from Web site 21 by way of cross-site link 25.

Finally, as noted earlier, certain articles have been excluded from extracted data tree 27 due to
the structure of Web site 21 or possibly a content of [illegible]. For example, [illegible] and G
have been excluded from extracted data tree 27.

According to this embodiment of the invention, extracted data tree 27 is flattened into linear
document 32, as shown in Figure 3C, possibly with reference to [illegible]. Linear document 32 is
essentially a continuous document with information from extracted data tree 27 [illegible].
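One way to picture the flattening step is a depth-first, pre-order walk of the tree that emits each node in reading order. The sketch below assumes a tree held as nested (label, children) tuples and records each entry's depth so that a later formatting step could tell site labels, headings and articles apart; both choices are assumptions of this illustration, not the flattening rules of the specification.

```python
# Hypothetical extracted data tree as nested (label, children) tuples.
tree = ("root", [
    ("Web site 21", [
        ("heading 1", [("article A", []), ("article C", [])]),
        ("heading 2", [("article B", [])]),
    ]),
])


def flatten(node, depth=0):
    """Depth-first, pre-order walk returning (depth, label) entries."""
    label, children = node
    entries = [(depth, label)]
    for child in children:
        entries.extend(flatten(child, depth + 1))
    return entries


linear_document = flatten(tree)
```

Each heading is followed immediately by its own articles, so the linear document can be read from top to bottom without following any links.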

Finally, linear document 32 is formatted according to user specified (or default) formatting
instructions into formatted document 33, shown as a stylized personal newspaper in Figure 3D.
Formatted document 33 has various fonts and/or colors for site labels, indices/headings, articles
and the like. Furthermore, formatted document 33 is broken down into pages.
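The formatting step can be sketched as applying one style per kind of entry and then breaking the result into pages. In the sketch below, the STYLES table is a hypothetical stand-in for user specified or default formatting instructions, plain-text decoration stands in for fonts and colors, and a fixed number of lines per page is an arbitrary simplification.

```python
# Hypothetical linear document: (kind, text) entries in reading order.
linear_document = [
    ("site", "Example Daily News"),
    ("index", "Business"),
    ("article", "Article A text"),
    ("article", "Article C text"),
    ("index", "Sports"),
    ("article", "Article B text"),
]

# Hypothetical default formatting instructions, one style per entry kind.
STYLES = {
    "site": lambda text: text.upper(),
    "index": lambda text: "== " + text + " ==",
    "article": lambda text: "  " + text,
}


def format_document(entries, lines_per_page=4):
    """Style every entry, then break the styled lines into pages."""
    lines = [STYLES[kind](text) for kind, text in entries]
    return [lines[i:i + lines_per_page]
            for i in range(0, len(lines), lines_per_page)]


pages = format_document(linear_document)
```

Keeping the styles in a table separate from the data means the same linear document can be reformatted simply by supplying a different table.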

Note that in alternate embodiments, news retrieval can skip some of the above transformation from
Web site 21 to formatted document 33 [illegible]. For example, data from Web site 21 can be
retrieved directly into flattened document 32, as long as [illegible] of the data is maintained
(possibly in a separate linked list) so as to [illegible] the organization of Web site 21.
Alternatively, extracted data tree 27 [illegible]. In any case, the basic operation of the
invention remains the same. The system traverses a hypermedia document on the Web, extracts data
according to [illegible] newspaper.

As mentioned [illegible] (such as formatting information) are involved in the news retrieval
process. That [illegible] information is stored in personal-news-profile(s) 19, the definition of
which is described next.

Defining a Personal-News-Profile

Figures 4 and 5 illustrate the process by which personal-news-profile 19 is defined. To create
personal-news-profile 19, personal-news-profile editor 16 communicates with personal-news-profile
19, site profile 20, and Web reader 34.

Personal-news-profile 19 contains information as to what sites to access for creating a
personalized newspaper, what sections to retrieve from those sites, rules to be used to determine
what data to extract from the sections and the articles therein, rules to determine how to exclude
links, and newspaper format information. A sample personal-news-profile is shown in Appendix 1.

Site profile 20 includes general site information that is not specific to a particular user. For
example, site profile 20 could contain information such as full site addresses, sections within a
site, non-user specific passwords, etc. Sample site profiles are shown in Appendix 1. Because
general site information is stored in site profile 20, personal-news-profile 19 can refer to the
general site information with reference to site profile 20, saving space in the
personal-news-profile. For example, as shown in Appendix 1, personal-news-profile 19 can refer to a
site number 1. Site profile 20 indicates that site number 1 is the "San Jose Mercury News," with a
homepage "http://www.sjmercury.com/". This construction also centralizes general site information.
Thus, if a site address changes, only site profile 20 needs to be changed to update all
personal-news-profiles [illegible].
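The indirection between the two kinds of profile can be sketched as a lookup by site number. The dictionary layouts, user names, site name and addresses below are all hypothetical placeholders chosen for this illustration; the actual profile syntax is the one shown in Appendix 1, which is not reproduced here.

```python
# Hypothetical site profile: general site information shared by all users.
site_profile = {
    1: {"name": "Example Daily News", "homepage": "http://news.example/"},
}

# Hypothetical personal-news-profiles: each refers to a site by number only.
personal_news_profiles = {
    "user_a": [{"site": 1, "sections": ["Business"]}],
    "user_b": [{"site": 1, "sections": ["Sports", "Comics"]}],
}


def resolve(user):
    """Expand a user's profile entries into (homepage, section) pairs."""
    pairs = []
    for entry in personal_news_profiles[user]:
        homepage = site_profile[entry["site"]]["homepage"]
        pairs.extend((homepage, section) for section in entry["sections"])
    return pairs


before = resolve("user_a")
# The site moves: only the shared site profile is edited.
site_profile[1]["homepage"] = "http://daily.example/"
after_a = resolve("user_a")
after_b = resolve("user_b")
```

After the single edit to the site profile, both users resolve to the new address even though neither personal profile was touched.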

Web reader 34 is an application program [illegible] which communicates with the Web via Web
server 35. In response to commands from personal-news-profile editor 16, Web reader 34 will access
the Web, traverse hypermedia documents on the Web, retrieve data from the documents, and return the
retrieved data to personal-news-profile editor 16.

As shown in Figure 4, personal-news-profile editor 16 includes four modules: site driver 36, Web
reader interface 37, profile manager 38, and format editor

Web reader interface 37 interfaces personal-news-profile editor 16 to Web reader 34. Site driver 36
interacts with Web reader 34 via Web reader interface 37 to provide an [illegible] interface to
each individual Web site. More specifically, site driver 36 instructs Web reader 34 to access
various Web sites and to retrieve data from those sites. Thereafter, site driver 36 receives the
data and builds site profile 20 therefrom. The data also can be used to update an existing site
profile.

In building site profile 20, site driver 36 translates the data from each accessed Web site to a
uniform structure defined in site profile 20, and stores data [illegible] in site profile 20.
[illegible] different Web sites, some of which may have different [illegible] structure in site
profile 20, the present invention facilitates access to information from different Web sites, and
thus reduces overall processing time.

Profile manager 38 maintains document templates that specify how to format a personalized
newspaper. Predefined document templates exist. In addition, the format editor allows a user to
specify personalized templates for formatting a newspaper, either by editing existing templates or
by creating new ones. In any case, each document template specifies page layout information, font
information, [illegible] for section/headings, subheadings, text and the like for a personalized
newspaper.

Sample code for personal-news-profile editor 16 [illegible] (profile manager 38) is included in
Appendix 3A. Figures 5A and 5B are flow diagrams describing the [illegible] personal-news-profile
editor 16 in more detail. Figure 5A shows the operation of personal-news-profile editor 16 in
defining the part of personal-news-profile 19 relating to accessing Web sites and retrieving
[illegible].

In step S500 of Figure 5A, [illegible]. In step S501, the editor launches Web reader 34. The user
[illegible] I.D. [illegible] for that I.D., step S503 directs flow to step [illegible]. Otherwise,
personal-news-profile editor 16 enters a "learning mode" [illegible]. Once in the learning mode,
personal-news-profile editor 16 proceeds to step S506, where it accepts a Web command (e.g.,
[illegible] to a hypermedia link) from the user and forwards the Web command to [illegible].
[illegible] site driver 36 maintains a hierarchical log of Web [illegible]. [illegible]
personal-news-profile editor 16 creates a selection rule from the Web command. [illegible]
selection criteria in browsing [illegible].

The rule specifies [illegible] traversed [illegible]. For example, if a user accesses all articles
under a particular [illegible] that index/heading should be retrieved.
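The idea of deriving a rule from what the user actually browsed can be sketched as follows, on the assumption (read from the example above) that opening every article under an index/heading yields a rule to retrieve everything under that heading. The rule tuples, the extract_rules name and the sample headings are hypothetical; the real rule syntax is the one given in Appendix 2.

```python
def extract_rules(site_links, opened):
    """Turn a log of opened (heading, article) pairs into retrieval rules.

    site_links maps each index/heading to the articles listed under it.
    """
    rules = []
    for heading, articles in site_links.items():
        hits = [a for a in articles if (heading, a) in opened]
        if hits and len(hits) == len(articles):
            # Every article under the heading was opened: generalize.
            rules.append(("retrieve_all", heading))
        else:
            # Otherwise remember only the individual articles opened.
            rules.extend(("retrieve", heading, a) for a in hits)
    return rules


site_links = {"Business": ["A", "B"], "Sports": ["C", "D"]}
opened = {("Business", "A"), ("Business", "B"), ("Sports", "C")}
rules = extract_rules(site_links, opened)
```

Here the user read everything under "Business" but only one of the two "Sports" articles, so the sketch generalizes the first heading and records a single article for the second.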

In one embodiment, the [illegible] content-based criteria (e.g., keyword-based criteria) accepted
from the user. [illegible] (1) [illegible] in the title and/or article, (2) exclude articles with
certain words, (3) [illegible], (4) rank articles that are selected based on structural criteria,
with the ranking based on keywords, and then require the selection of the articles with the highest
ranking(s), or (5) exclude certain types of items, such as advertisement items.
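Keyword-based inclusion, exclusion and ranking of the kind listed above can be sketched with a small filter. The function name, its parameters and the scoring method (counting keyword occurrences in the title and body) are assumptions made for this illustration only; the specification does not state how rankings are computed.

```python
def select_articles(articles, include=(), exclude=(), rank_by=(), top_n=None):
    """Filter (title, body) pairs by keyword, then rank by keyword count."""
    scored = []
    for title, body in articles:
        text = (title + " " + body).lower()
        if include and not any(word in text for word in include):
            continue   # none of the required keywords is present
        if any(word in text for word in exclude):
            continue   # an excluded keyword is present
        score = sum(text.count(word) for word in rank_by)
        scored.append((score, title))
    # Highest score first; the sort is stable, so ties keep source order.
    scored.sort(key=lambda item: item[0], reverse=True)
    titles = [title for _, title in scored]
    return titles if top_n is None else titles[:top_n]


articles = [
    ("Stocks rally", "Stocks rose as markets cheered"),
    ("Local team wins", "The team won"),
    ("Ad: buy stocks", "advertisement for stocks"),
    ("Markets wobble", "markets fell; stocks mixed; markets nervous"),
]
chosen = select_articles(
    articles,
    include=("stocks", "markets"),
    exclude=("advertisement",),
    rank_by=("markets",),
    top_n=2,
)
```

In this run the sports story is dropped for lacking a required keyword, the advertisement is dropped by the exclusion list, and the two remaining articles are ordered by how often "markets" appears.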

Examples of the syntax for the structural and content-based selection rules are shown in Appendix 2. Several
different types of rules are shown. Some simply limit the traversal of a Web site to a certain number of links. Others are
date and keyword based exclusion rules. One particular type of rule indicates that articles should be ranked based on
a keyword analysis and that the top scoring articles should be retrieved. Other rules include "flattening" rules. These rules
control the flattening of the extracted data tree, as will be explained in more detail below.
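
As an illustration of the date and keyword based exclusion rules and the keyword ranking rule described above, the following is a minimal sketch only. It does not reproduce the rule syntax of Appendix 2; the `Article` record, the function names, and the parameters `excluded_words`, `oldest` and `top_n` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    # Hypothetical record for one retrieved article.
    title: str
    text: str
    published: date


def keyword_score(article, keywords):
    """Count case-insensitive keyword occurrences in the title and text."""
    body = (article.title + " " + article.text).lower()
    return sum(body.count(k.lower()) for k in keywords)


def select_articles(articles, keywords, excluded_words, oldest, top_n):
    """Apply date and keyword exclusion, then keep the top_n ranked articles."""
    kept = []
    for a in articles:
        body = (a.title + " " + a.text).lower()
        if a.published < oldest:
            continue  # date based exclusion
        if any(w.lower() in body for w in excluded_words):
            continue  # keyword based exclusion
        kept.append(a)
    # sorted() is stable, so equally scored articles keep their original order.
    ranked = sorted(kept, key=lambda a: keyword_score(a, keywords), reverse=True)
    return ranked[:top_n]
```

In this sketch exclusion runs before ranking, so an excluded article can never occupy one of the top-scoring slots.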

At the least, the rule includes structural information about the user's selection (i.e., first page, first document, all
links, etc.), necessary password information, browser commands, and the like. The rule can also include a pointer or a
reference to site profile 20 and the appropriate information therein. General (non-user specific) information is used by
site driver 36 to maintain site profile 20. In this manner, address information and passwords common to multiple users
can be maintained in site profile 20, as discussed above. For example, site driver 36 will store commands or hyperlinks
to other documents in a Web page in the rule, but will not store a Web site's full address in the rule. That address
information is stored in site profile 20.

In step S508, the data defining the rule deduced from the Web command(s) is stored in an extracted data tree such
as extracted data tree 27 in Figure 3B. This data tree is a data structure that reflects the organization of the data retrieved
from the Web. In step S509, flow returns to step S506 for another command unless the user is done (i.e., the user
signs off the Web site), in which case flow proceeds to the next step.

At this point, the creation of the personal-news-profile has proceeded much like the creation of a macro common
to word processing programs, except that site profile 20 is used to minimize storage requirements and to
centralize general site information. In order to minimize storage requirements further and in order to make the news
retrieval system more flexible and efficient, the extracted rules are now compiled to remove redundant links, multiple
visits to the same site, and the like. The resulting compiled rules become the first part of
personal-news-profile 19.
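
The compilation step described above (removing redundant links and multiple visits to the same site) can be pictured with a short sketch. This is an assumption about one possible approach, not the method of the appendices: the recorded rules are modeled as hypothetical `(site, link)` pairs, and `compile_rules` is an illustrative name.

```python
def compile_rules(recorded):
    """Group recorded (site, link) steps by site and drop repeated links.

    Each site appears once in the result, in first-visit order, so a
    later traversal does not need to visit the same site twice.
    """
    compiled = {}  # dicts preserve insertion order in Python 3.7+
    for site, link in recorded:
        links = compiled.setdefault(site, [])
        if link not in links:  # a redundant link is recorded only once
            links.append(link)
    return list(compiled.items())
```

For example, a session that follows the same front-page link twice and returns to a site after visiting another one compiles down to a single entry per site.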

Alternatively, personal-news-profile editor 16 may provide a graphical user interface which allows a user to
edit a previously stored personal-news-profile or to input preferences, for example, by
specifying news sites, headline articles only, keywords, etc. In either case, the result is personal-news-profile 19, which
comprises a listing of Web site pointers as well as extracted rules for traversing through a Web site or sites.

For a better understanding of the above, sample personal-news-profiles and sample site profiles are provided in
Appendix 1 as noted above.

Next, operation proceeds to give the user the option of creating a custom newspaper template, as shown in Figure
5B. In step S511, it is determined if a newspaper template has been defined in personal-news-profile 19. If
a newspaper template has been defined, the user is given the option to edit the template or to proceed to step
S520. If the user chooses to edit the template, or if a newspaper template has not been defined, flow proceeds to step
S513.

Step S51 3 gives vl-»2'jsyfthi} opto* a^uuatl ig a 'j>)'Or.<i n ,ji-i\5ciusiji2i\tfi '.1^^ fitciii-mplate. If IJia user wants 
to use a prolrfineil teinjjlate. slsp gets the 5ip«jcl'5i.i prtcJefinecl tempfet: whic/ j Is added to the personal-news- 
profile instep S5ia D'htiv/iij, flov^ p-a^tU.s'A. tji-;-- '1 •.. x . ja? ^o-n^m Liii?.: ' :v' 1/ .oloid. 

40 Formal editor 39 has d gicijjhic^i jco- ...'•ffia:.} . ^ inu i/i-s; v/:ii . .n' f.^ero/ \ormi.tting options. Instep 

S516. format edito.* 3iJ allCiWsi .::*e ise.' ii; I'-rn Dwiicf; i awr^Ki-i: v; sicctions iir-. to bo printtic in tha newspaper, which 
Web site's newvs ardds are tc i.;.-: pIs:cGu ii £^=::^?*l ^ ^.vio: , r 'i::x s uu uac;i ; '.i its uc: 'a.d c.j^. in •*h:s regard, the user 
can specify which V.a b sile's ne-«u a- . d-'- i cj -u ti? -t/ usrf . , ic-. it sr.: u 5^:. un u Vt^fc'j i.: :*3 i iu./s arlicles are to be used 
as a business pag-:.. ^A^ich Web site'r» nea'is ■c.r'jd.&:i ar^v Iv c:e riKi ut; a spor's cigft, ct!:^. Ir. tidciition, in step S516. the 

45 user can define w* 12. e-xh t/J 3.<» Ji; .j^j 5 b h !ivt I, l. i of «. ? c/^'^iO l . r liiLr g: .:i :o j.ii go on sach page. 

In step S517. :osi^mt JidKc. 'i£ ai't>.«, li u t:; w ..i lii - l,: Ji:v*^;i re . utb- headings, bylines 

and actual text o rj/Ji't* lii / So. 3, iv. i.f.u.^. C'V -0 ItSi ; ,a a 'riiax^'lieading colors, title 

colors, etc. In thi6 recoil d, !u>c J. 3:li:o. US ii-. uum!:>.o w .u .^'iv. uit c^yiixi »*;c^^:^^l.•u ixius srv'gilKbietotheuser 
based on the systerr/s srintii' 'japabilfiieu. 

50 Onceal!oftliD5;j?o ,T:£i:on 5gf:lh.';.f'.J ' J t. ; .L.'t. .-i: uotornv' (Kl; .j.'±- i;r irfiO iTaliun to personal- 
news-profile 19 .stotj 3519. AUejnativ^iH'y jnofiii; 'r.r. / ^ S' C • niji - iiio n**'^: il e ..usuC '.i •onnat as l template in a 
common area for cse 3th J. 'i. ^ "•r.'E.ie ..Ij -.-.•.Vi. t vr.-:: r: .-ru*; K'.t-smplate is stored in per- 
sonal-news-profile 

In step S520, personal-news-profile editor 16 allows the user to set a newspaper delivery time and
method (i.e., print or store on disk), which is stored in personal-news-profile 19. More
specifically, in the case that a delivery time is specified, the Web news retrieval system can
be launched automatically at the designated time to retrieve articles from the Web sites which are listed in
personal-news-profile 19 and to format the retrieved articles according to the newspaper
template in personal-news-profile 19. The formatted personalized newspaper can then be either printed or stored for later
viewing. In the case that a delivery time is not specified, the user can launch the Web news retrieval system
program at any time.

Once personal-news-profile 19 has been created, the Web news retrieval system, upon being launched, can
traverse Web news sites and build a personal newspaper automatically by retrieving various news articles from the
Web news sites and printing the news articles based on the newspaper template indicated in personal-news-profile 19. A
description of how the Web news retrieval system of the invention performs this function is provided next.

Retrieving a Document Using a Personal-News-Profile

Figure 6 is a representational block diagram showing the manner by which the invention retrieves articles from the Web
according to personal-news-profile 19. (Figure 6 also shows the manner by which the retrieved articles are flattened into
a linear document and formatted. These functions are discussed in greater detail in the next section of this application.)

As shown in Figure 6, Web printer 17 is responsible for retrieving news articles. Web printer 17 is an end-user
application that communicates with personal-news-profile 19, site profile 20, Web reader 34, and output interface 40 in
order to perform this function.

Web printer 17 looks at personal-news-profile 19 to determine which Web sites to access and which data to retrieve
from those sites. Web printer 17 also looks at site profile 20 for general site information. According to the information in
personal-news-profile 19 and site profile 20, Web printer 17 instructs Web reader 34 to connect to the Web via Web
server 35 in order to access various Web sites and to retrieve data from those Web sites. Web reader 34 sends the retrieved
data to Web printer 17, and Web printer 17 uses the data to build an extracted data tree, as will be discussed in greater
detail in the next section of the application. Web printer 17 then flattens the extracted data tree into a linear document
and formats the linear document for output via output interface 40.

As shown in Figure 6, Web printer 17 includes four modules: Web reader interface 50, site driver 51, tree
manager 41, and formatter 42.

Web reader interface 50, like Web reader interface 37 discussed above, interfaces Web printer 17 to Web reader
34.

Site driver 51 accesses site profile 20 and personal-news-profile 19 and provides data stored therein to Web reader
34. As noted above, Web reader 34 uses that data to access various Web sites and to extract data therefrom. As noted
above, this retrieved data is used by Web printer 17 to build an extracted data tree.

Tree manager 41 manages the extracted data tree and keeps track of the organization
of the retrieved data in the extracted data tree. This allows tree manager 41 to avoid retrieving the same article twice, to
avoid unnecessarily re-visiting a Web site, and to avoid difficulties arising from the organization of a hypermedia
document on the Web. Alternatively, tree manager 41 could store the data in nodes (as opposed to directly in a
data tree) with reference to a linked list that provides the same functionality as the extracted data tree. Sample code for
tree manager 41 is included in Appendix 33
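
To make the role of the tree manager concrete, the following is a small sketch of a tree that refuses to store the same address twice. It is not the sample code of the appendix; the `TreeNode` and `TreeManager` classes and their methods are hypothetical names, and the use of a dictionary keyed by URL is an assumption made for this example.

```python
class TreeNode:
    """One retrieved page or article in the extracted data tree."""

    def __init__(self, url, content=None):
        self.url = url
        self.content = content
        self.children = []


class TreeManager:
    """Keeps the tree plus a URL index so nothing is fetched twice."""

    def __init__(self, root_url):
        self.root = TreeNode(root_url)
        self._nodes = {root_url: self.root}

    def seen(self, url):
        """Return True if the URL is already stored in the tree."""
        return url in self._nodes

    def add(self, parent_url, url, content=None):
        """Attach url under parent_url; return False if it was already stored."""
        if url in self._nodes:
            return False  # same article, or a link looping back to a visited page
        node = TreeNode(url, content)
        self._nodes[parent_url].children.append(node)
        self._nodes[url] = node
        return True
```

A traversal loop could call `seen()` before asking the Web reader to follow a link, which also stops links that loop back to an already visited page.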

Formatter 42 is responsible for flattening the extracted data tree into a linear document and formatting the linear
document into a personalized newspaper. Formatter 42 performs these functions in accordance with the print criteria
and format information (i.e., newspaper template) provided in personal-news-profile 19. Sample code for formatter 42
is included in Appendix 38.

In more detail, Figure 7 is a flow diagram showing how Web printer 17 traverses the Web
according to personal-news-profile 19 and retrieves data from the Web according to the profile, excluding unwanted
data.

The Web printer s.c'rts ti sief S7C". : t 1, ?»f. : ; ' »r .si.' ac . iiJu . ; .i i z!tiiyrrv'i€c cersonal- 

45 news-profile or a defeuft psiox J n-.^-'iSi ;)i jfi'B 5>a;'3v1 \ : .iis: ; T. usi '.j J.'* o ; ' . rrurJ. '.HjOduse com- 
puter equipment 1 mayi>i ♦jsi-d b; muri on^i UJiur, du. ; ; 'ey one or nioi L i a /.^: .i..v''-i"}'of'I?i<.sxred on the 
equipment, one of wt.ich \mn b<. drjiigr sXtiZ sci the d i)f;*iCi Lfp; t rftievirc) J-x^ d'.iX/.»';.cd a.'.; -.{. -n^ LVi-prcrfile, in step 
S702 Web printer 1 7 delt;»rr irv: ;r / nc*'. * <^ata I'si;: ^icer prev:ta«;Jy it - li\ d . c (for ii.<:3ir45le, t^y a 

previous automatic n.wo r-'Hlivjr/ r r if r ev.-j articles sf'oulc' Ij.; <'3ii*i€ved using pGH'soifil-riews-prcfile 19. 
so In the case that nr.- s i'^ifi ^t^n uvi; ' ' >.> "^03 .he s.Oi:i i i.^^.u X. i ; .V*. it? crd^lcw pro- 

ceeds to step S801 5"f{jt;r:-' 3. rj =»a«:si2d 'n jthu: ■ 'ati-il r ; i.tt:.'^ sec^i^r. Or» iht ote ii'-m. if ntoracl news data 
exisis, Web printer 17 i:T\;:ku:; V.'^h --Gsri ji ?• . ; i i5i ^ ' ^is! lif j is ifr: • r '.V.L Lc;-cr 34 asi c'iscussed 
above with respect to fiMining a peri^nal-n<3vu£.'pra dfi. 

Upon being irwoJ.-.:!, WebrwaJ,,* n.e„t . Z', nsu. i**^". it-- ' sctior. :oanet- 

55 worKsuchastheWc.:c7rj'ciWe:3- ';Vi;j pitrt^^ ^;j..i.w'i.v. ■"eb::3i:'sj.'3' .'.'i^l* c i : \\j ^ f:»rc't5 first V7eb site 

to be visited based or hfor rcJo.i rsbicwuc frci. .i-csvcuii-r:*. :fi!e Diue ^-i ij* to desired Web site in 
stepS706. Web printer ^.o/.it-j A'lt. rud-r .... « . link-. L. ti.. i...:; . !<, '-.iLi Ij'Ju.r.ciC V/eb page 

containing infomnaliOi7 u.a,':its-/,wiI-..€. s ^:iil.<: 'J .rJ^j.. . ^\:..i.d • Iri^vi J. vV.. ; ^ jc. C'. Ji./u^di the Web 
according to this information in step S707.

lnstepS708 V^/p± reader 31 retrieves iiec's^'art ^^:*^ .n* 'inn s ' J^S'tt'ci, : * ; ,;':".:r l7cCCordingtotherules 
In personal-nesws-pfcfie 19. Thjs. dat?. :?cc '^ . is sti 'n:*i ri:>:r> ;r iti-r^iV^s-profile 19 specify 

structuraJ andconti»nl>r.~ed cn:3r!i^: oi'i.'i d:r[j ..r. £ ' 'ir T.i per5 ."3!ifoci • aA-sr, ^ra* T':e £tri.iaj»*al rules limit the 

£ retrieved informaticn a ''€:3b?.si3 0. :h:st t: u' r tio'Vet 'laatcce* ^; V j/is^-i^. .4. The ccntsnt-basedrules 
limitthe retrieved ir.fanTA^ian jr.si:^ 25 c^^^ i !ion:;!ubr.3'./t .. >^ -* rrsafngs personal-news- 

profile, examples of the s/:itax cf the re^ieva' nj es in pj /oorE'-news-profilfc 1C arr Ini^u^ieiJ h /^opendix 2. 

In addition to rule-based exclusion, media-type exclusion also occurs, wherein data of a media type that
cannot be printed is excluded from the extracted data tree. For example, movie and sound data can be excluded.
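
Media-type exclusion of this kind can be sketched as a simple filter. The sketch below assumes that each retrieved item carries a MIME-style media type string such as `text/html` or `video/mpeg`; the text does not say how media types are identified, so that representation, the `PRINTABLE_PREFIXES` tuple, and both function names are assumptions made for illustration.

```python
# Assumed set of printable media type families (text and still images).
PRINTABLE_PREFIXES = ("text/", "image/")


def is_printable(media_type):
    """Return True if the media type belongs to a printable family."""
    return media_type.lower().startswith(PRINTABLE_PREFIXES)


def exclude_unprintable(items):
    """Keep only (url, media_type) pairs whose media type can be printed."""
    return [(url, mt) for url, mt in items if is_printable(mt)]
```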

Web printer 17 stores the retrieved data on the disk drive (or in main memory 14) in the extracted data tree managed
by tree manager 41. Alternatively, the data could be stored in nodes with reference to a linked list, as discussed earlier.
In step S709, Web printer 17 returns to step S707 to continue retrieving all information from Web pages at the Web
site. In step S710, upon completing a traversal of one Web site, Web printer 17 uses tree manager 41 to compare the
sites remaining in personal-news-profile 19 with the site organization information in the extracted data tree to determine
if more sites need to be visited. If more sites need to be visited, flow returns to step S706
and news articles are retrieved in the same manner as discussed above. On the other hand, once all of the Web sites listed
in personal-news-profile 19 have been visited, flow proceeds to step S801 in Figure 8.

Flattening and Formatting the Retrieved Data



Figure 8 is a flow diagram showing how the extracted data tree is flattened and formatted. The configuration of the
invention is the same as when retrieving data from the Web (shown in Figure 6). In fact, the flattening and formatting
processes can occur, at least to a limited extent, concurrently with the data retrieval process.

In step S801 of Figure 8, the extracted data tree is flattened. This simply means that the organization of the data is
converted from an extracted data tree to a linear document. This step provides the opportunity for excluding more data
from the personalized newspaper, for example by leaving some nodes of the data tree out of the flattened document. This
exclusion process is controlled by the flattening rules in personal-news-profile 19.
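
The flattening step can be pictured as a depth-first walk that copies node contents into a list, skipping any node a flattening rule rejects. The sketch below is an assumption about one reasonable traversal order, not the appendix code. Nodes are modeled as plain dictionaries with hypothetical `content`, `children` and `kind` keys, and `keep` stands in for whatever flattening rules the profile holds.

```python
def flatten(node, keep=lambda n: True):
    """Return node contents in depth-first, pre-order sequence.

    A node rejected by keep() is dropped together with its whole subtree.
    """
    if not keep(node):
        return []
    out = [node["content"]] if node.get("content") else []
    for child in node.get("children", []):
        out.extend(flatten(child, keep))
    return out
```

For example, `flatten(tree, keep=lambda n: n.get("kind") != "ad")` would leave out every node marked as an advertisement, along with the pages below it.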

After the data is flattened into a linear document, the data is formatted according to the template
indicated in personal-news-profile 19. The creation of this template, which is either a predefined template or a custom
template, was discussed above. Finally, the formatted data is sent to output
interface 40. This interface could be a printer interface, a disk drive interface, an interface to display 2, or even modem
interface 6.



The second emiSOQifritjnl cl tut^ ii Ajntijn is ^ iv:»:c-jHi for prxiauciivji a h/P&ri'iicidia docLiment. The system 
accesses the hypermedia coa:in£Hi. a»j.r5i(;tii 3Lt'J:v. *«.: ixr* the I .^drnter^a Goca'iien:, und storii; the addresses 
extracted from th^ VP'->J •'-^'Cii^ Jocjii.iirt i;i l co.t'iiL'u. hi^ o/slb.^i iicHvaifis a p\'L»j.:'oi'nj fuficlicn *') process data 
stored at the addre^.J Ht; z?.cri.t: . 1 ihi: rji -a n .r, ocir. ..lc..:.. , 1, Ji^ta ^Itr-r-nJ .a tS? ci±li .ii;.j3£. ilD'tid if. me container into 

40 a memory, and extc.c-;> p'O.l^.'lifr.Vi.if.L d^^{ l\jr. . ::.;n:..:..»iC diiti tr. sX-CiCance iwih redetermined configuration 
information. The pi-j<icitiri/.it.J dti'a ij .liii. /..'..'.jctcaU «i „:c.i7Janvj iflili (.it-j. .nv.: jr 7.c.fjAj ieii<r*ga to generated 
formatted docunici 1, afiJ tjti: .oriviSinu:; c^.^^ r,-jx is!.i i^^. iH c....L/ric u*^ • .li^ Jt.; ai..-. ^ij.uij iir.GUjM. 

Thesecondera:oJi.i.i;r.^;iJi£::nv:«lwi.;s*:.-...Jt^ >J All'X uuiierlh, rA.^ nVx.:--:!. exair.ple of HTML 
formatter 18 is Wsa-orniiiLe.. rnavj:jiti:u£'.* b, }6nc -../.n laticr {1>£.Il».mj irv; U*,' i-i0zr6 ernbcdiment will be 

described with respect to WebFormatter. It should be noted, however, that HTML formatter 18 is not limited to the
WebFormatter embodiment, and that various alternative embodiments within the spirit and scope of the following description
are possible.

WebFormatter is a stand-alone utility which can be used in conjunction with different Web browsers, such as
Netscape, Mosaic and Internet Explorer. WebFormatter extracts data from a Web page, strips out
extraneous data from the extracted data, and formats the data into a formatted document. The formatted document can
then be printed, stored as an RTF (Rich Text Format) file, or edited by any RTF-compatible editor, such as MS Word,
WordPerfect, Word, and the like.
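
The extract-and-strip stage described above can be sketched with Python's standard `html.parser` module. This is only an illustration of the general idea: it produces plain text rather than RTF, it treats script and style blocks as the unwanted data (an assumption), and `TextExtractor` and `extract` are hypothetical names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text and link addresses; drop images, scripts, styles."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.links = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        # Any other tag (img, font, table...) is ignored; its text still comes through.

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(" ".join(data.split()))


def extract(html):
    """Return (text, links) for one page of HTML."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts), parser.links
```

The collected link list is kept separately so that a later formatting stage could, for instance, append it to the end of the document.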

WebFormalt^j c>:t i f j.i v ft" t j.'- r ^ i.. '' ^ \ -mcY' li « io; 5 .^ V'rJi:.u.^-\ rran such a win- 

dovwng environm^.U Vvc.'.)!'u .iiatti:! :;. 1. c i I /..l i! c. ;. .= dic«*v.ic n ' v. v ric I.:.' /uiOh^M-n) inastart- 
55 up vwndow. selsc^raj ^ivvTj r.!af.-:r v \ i \ J.' 1 k : .nj, ■.. .^i-u .1 i.j ^xi j . ic: ^.lor) loon (not 
shown) from a Ws^ u.o^.sij: . .ro c i .'i V'.' :..;iuj' c/ • . . ..ui , . ' , IwibFormatter 

when the Web bro vc^i li sU- .s i. 

Unlike the first embodiment of the invention described above, WebFormatter does not use a predefined
personal-news-profile to specify criteria for creating a particular type of document from one or more Web pages. Rather,
WebFormatter relies upon user-specified criteria to create a particular type of document, such as a newspaper or the like,
from one or more Web pages. These criteria are input interactively by a user via a graphical user interface.

As described in more detail below, WebFormatter operates in two modes - a minimized mode and a fully-functional
mode. In the minimized mode, WebFormatter's graphical user interface presents only a minimal set of controls, which is
displayed concurrently with displayed Web pages. By virtue of this feature, WebFormatter provides a way in which the user can
process, format, and print out Web pages by merely clicking on a print button.

In its fully-functional mode, WebFormatter's graphical user interface provides means for a user to enter a URL
address of a Web page to be processed, enter a personalized title for the document, select a format for the document,
preview a formatted first page of the document, and either print the document, save the document as an RTF file, or
view/edit the document using an RTF editor. The graphical user interface for the fully-functional mode will be described
first, since it is from that interface that the user can enter the minimized mode.

Figure 9A shows graphical user interface 43 for the fully-functional mode. Graphical user interface 43
is displayed on display 2 upon activation of WebFormatter. As is the case with any windowing software application,
a user interacts with graphical user interface 43 by means of mouse 4 (by pointing and clicking) and keyboard 3.

As shown in Figure 9^^ graf;)hic4ii us^ r r. inac^r -sT.i incu ur;?^ /i.- ds 44 li vJ 46 to /.'K ^ ;..a ;^!'. ^-vhkMi ^ user can specify 
the URL address of ad rjuirientlc. im fO'iMeiteci sr i ti^-- 'i' .1 :f '':»i*:r.'c':»/rr.eiY«. S-y'/nir-i - iih j:-Lf;eic'44. a user 
emers the URL address iBg,hitr.vttv/;/.(::iC3^c.'' :o.iM-i vs.rr.'^i; hih' c'oicnsudby WebFor- 

matter. There are sever?;: rvc.yJ:' ^oi tv^ 'jr-j /j . • . ijHf . cidrci.i "t^ u. b; C:r ^'l) rype t;ie address 

directly into URL field 44, (2) copy the URL address from the Web browser and paste the URL address into URL field 44,
(3) drag the URL address from the Web browser onto graphical user interface 43 or onto a WebFormatter icon, or (4)
click on Current URL button 54.

With regard to Current URL button 54, when Current URL button 54 is activated, WebFormatter locates the active
Web browser and queries the Web browser for the address of the current Web page. In response, the Web browser
provides the address of the current Web page to WebFormatter, which places the address in URL address field 44. If URL
button 54 is activated and no Web browser is currently running, WebFormatter displays dialog box 55, shown in Figure
9A.

As shown, dialog bxi 5L :iuludes C-'.^;6; Jj-..ur. 57 ci'i: l.a..Pwh L^r^/^iar 'xa\«". c9 Ca Luu-jn b'^ cancels a 
user's request to inpula u:IL czddf sc^ ur.j J/IL •iciMi.J:: ^i: i! ^ i' C-r JPLbul o.i -5 uu.l!* Ere/;st3r button 59. 
30 on the other hand, ltur.:hi£i Wei- trcv,.-C' r^:K,^1ii !i. . '^(rJ/rrwtter I- configured 

beforehand with pitdsV. 11 j rfJii .£::or;:'v.u *«: :jC;M.jii ;.:t; r..;r: . ':c.r.:.;:.'3;ton of Web- 

Formatter will be desurbid.'i: L.i.Mj5tar. 

In aKer native en ibcc^invr.t- :/ /.'3brcr;.!- ; ?, .i:.:^; " u. J..':>^ 3 -i x' ! ' ijSV. arj<!:f3Gic fieid 44. For exam- 
ple, in these alternatt/G c TiaUiinuVlj. '! aujj* c fun n r.'pe<'i;''!:i>j ;. a i-a! !.ila,a!;':c»!v:iU^/fcrmat, the user 
35 enters the filename imc U.^ rJdres^ frold ■<^<. TtorevAUv, Vi'^;i;!-VjriT:sKerpjocec'cis .iiffDugli thS; fi!^ in the sairte manner 
as through specified Wtb i.^iv»5 in .''idt . ^ . e^ct . .:at il\'j 'Ti'^. =: !!.i.-.:t:i narta! J ...;ir:..'. 

Returning to grsphlc^il u:*.5/ iiTkirJc : ' * . ^it:i.;f.f? 4-.* r./". / i i.ser 0 'j:t: :i t c^^jw::';! - . jfii ^0: 3. •.o.rrtalled doc- 
ument. The title may br/typu: Jircjliy 01 .:?ia:bX ; ;x til: * 

Formatting fields 47 to 49 define the format of a document output by WebFormatter. Options in the different
formatting fields can be accessed by clicking on a scroll bar, which is to the right of a respective formatting field. Each
of these fields is described in more detail below.

Styles field 47 prov fx'' ^t:;r^ foi ,v:ii.3'. fon» • V .1 luni** Tfi^s-.'s /i,-:.- 3 ^ - 1:* ff»vri«:iiribiicsof an 
output document suC'i s.-. c * \i,Jc: » 11. ,/ .1 . • 'i. 71 . . » r*.-; ..;;ut ' • j, r^. . . • ui. ftri and Pro- 

fessional. Theinven5c.. r'< :i u^. . nu! V"* a J« r * ;; \'V v '.'^ . . . < n cc u.'M .-i .5d3Sired. 
45 Columnsfiefd'lfi dsfirtBt 0 I . < • (! Yr-ir Hlnr ;a:. j guiim: ::cc-/ifii!. if... vi)jr.rt> oplicns are availa- 
ble - Single and Mul!jp.ti; hiw i .,"e:v( r l' I 1 :c \. ::f^:: * ':i..\u'j^. V.i yi.': 3,:4;:"i. a^j might be 
expected, fornnats tht do.:-L ; V. ir x = -cXfJ.^ . ' V ^ . :o* . . a It:. f ra; -J ii; .Juc jment into 
a predetermined nurn^s. 'J >. In .orefint:d cj.ic.^di:;. if^r.; ;r»v?rit'c.i, h\s 'Jip't: t^itoi !s S3l to two col- 
umns: however, any p;.rij>r:- n fc3 
Spacing field 49 defines the spacing between lines in an output document. Three spacing options are provided in
WebFormatter, but other options can be added as desired. The three options are Condensed, Normal and Easy To
Read, with Condensed having the least amount of spacing between lines and Easy To Read having the most amount of
spacing between lines.

Graphical user inUij'r'js 'V. i ^.^.KfOj6 c-. r. 3^. h;.{:;.;.'i SC-. dick-v, cri ;:i€/}iiiv ju.iiC., HO. a user can 
55 preview a first page of 3. !i j i'j . « I' e^jPivw - - '• ' : •.»:.:: j J jcument Is 

shown in Figure 9A. 

AsshowninRQw.rr: : ^ 7/:., .1. . . ... ".viiitvlu. . : % ... tb Oi/ii^s^auser 

with additional formaltiiVa - ^.;^Jb; .♦A=L-v. '." . V:. ,0.:. ...v.edL* .:c;LiLi^..c;ii. wi.; A jcU t-ti.i activate 
Options button 61 by clicking thereon. This causes options dialog box 62 to appear on display 2.

As shown in Figure ^'B, oplxns d n eg ba< i v.'ir. 2 risra' op^on3 S^' . Cc - i ^sr opuo.-^ 66 and Strip Meta 
Info options 67. Gen:-?* cptbns 6^ rr'vt'rs "'^a. a: I .V:»ir ^2. "'rzt^x cf • ; h • _ Ife&::< 73, and "No float- 
ir^ pictures" listbox 7 1 Thaseop*xn\ "jf :Td:fV.1:.£ ' .-:3r-7*n6^b:,M;',h-; nr»*^c.'''r >.ci^i fsspective listbox. 
£ As will become c!e£»".TOv.'^,e:J'riEScr;;*:m, n if:te; iOl'*-!; oTiarjs.'i.GK tric* 'vtisn 5 0/^c:Lri be selected at the 
same time. 

Text only- 72 iirt'u'::^s *A!.^ :F';r-i-:i r to ^\ I /rinhic? ' : '.Vet ^^li r"! .ri*^ un!/ t:x* therein. "Index 
of links in the pggs' Hsibo 71 rrstruc?:; V/ebr-i-nstie'' id.' ? 3f i i U.^iU jrtsi^ -.t IVch i>5ge or pages to the 
end of a formatted dccur-rtrrt. ?: eferc :.'>, th.n (i?.! I Rl.'. . , prr ■ Bd as iupers;;'.; ui\ • iv. ho' pcfJlions of the URLs in 

10 the list are marked in bold. "No *te1n^. p cluriis ' isJx:. 71 :ns::tictj V/ebFormi ^-.e- 1: in: all images in the document 
in a particular area of the forr^ta led do^umsnt. \n szn ^ues liierefore, wht: 'iT:i c^&'ion is sel&Med: WebFormatter 
shrinks images, as nerdcJ s;: 1 -at nu-jsj t] ut^ a pc/ v.ia v ^ 

Strip Meta Into opi.ons 57 proviccj t?na;*23.in;; o'^: 3/.;^ ivMsh fecLtete strlppi"g ur.rseces^ry information from 
a page beinij pf{A;t'$it J by 'A tb?" r, tdUc ! . *> V b ..i t:, : Ul-os , 1 y ~i Jc r* c' , i; i.-w ^Jis Wetrof matter to strip 

15 nothing from the Wei: pa«c. (i; "';TI v.c 'n'tl .ui^.onLiu > . i\i..cl\ . y/^^ir'a: ;\tVirJ.' lo islrip aii !irii;s and images 

until and uptoprect.V..d ' -lUarro ubx^v: •.^ri.'.UiTu^i .>/rt ..u»ti. uleii , J^i .. *iil d i^r ^jn.a' L.u- acicss a page), and 
(3) Till the first toxt*', Ahicli i.vj^.'-cis v iftj iiuLcr tj . . ilJ .* .ui .i'lV:^, jp '.l. .iiJ-i ij J laat ax^r/ences of text 
in the Web page. Ciiy o:.o of £;.!p!fr :;ai i :!bopl.aiu^i v. :tr.h'i jDy*.tbJ.3i ./rtw*. &v,U-:;o*. tr«;rtof is indicated by a 
dot in a bullet located re>:t ty tr. epic •.. as £ hLW. in r^/.-- - SB 

» Container oplicu 33 j^.c*;.: 3 o .^.focecJC.. c, .:c ,i. rf-^\xi, s.Jc. b:*Kj.i;j n.r .A.ivh o.;c» s^l^^Jud in container 76 

shown in Figure 5B P'-j^r ;r, uiicrit:!.'- :^ C^./ji^r ti .iJ:. m -Ji^r.-ucn cf ^liiV^.iu./ 7'. vaI!! be pi'^-^iJ^od. 

As noted, co.'itairiiir / y ttrr-SRi ur... f.dc.iT;>i.fi5 u Uu. ui.iei A'o. l#xc»^-.:g. t i.ci:*\, ••sa^ Viiiich aiG input to field 

44 are added to container 76. The order in which URLs are input into container 76 determines the order in which data in
the URLs is processed by WebFormatter. When container 76 becomes full, its icon changes to
that shown by reference numeral 77.

When a user diCu:, Oi^ the vzoa '.u C04?':4i.r.o V; <riL:.u 77 d?V'^i^^- '^'^.f^L.' 77 jjr:?;.Ciis fivs r/Oticns: i.e., Open 
79, Empty 80. Priril ai. Ec.i El: ar c Ja^e U^. ri<e^i, j^r aun^. a o jijh^.Sv. w.Vrit^ ai^'i/dlevi, and £i»e descnbed in detail 
befow. 

Open 79, wtti.i a£,toitd. displ^Vu. Caniui-.sf jci.'.j .... u-^-syn J/tio\',.-; ;/ FigL. c: 23. ^ji**£oin47 Contents screen 
30 87 shows the URL accliess^io s!.o.xd i uA\ratt\^i , o. . : -.Ut .^i jauu.UM lit^i ee.* 3.' |.j.\.vio^.:. li/ur Oitio: js; i.e., Add cur- 
rent URL button S8 7/Hich .iticiii die v. Li.-*!. L w.;rn£ u li. * ^, ij-rii^ai ju.toi. :> atij-:.., ^.^i :mx^ asei to highlight and 
delete a URL in cor iaii^er 7 ii, :i:ni;.\y l'jtI:^:i ..^iiidi j- .r : Hiji a. -cjsr .o enpt; czr/ctiV M 7*5, ard Dene button 91 whksh 
permits a user tc c:i.>S3 Ccnta:.. Cl...c.-«3 ii^rx.i iV. w '..j it.ai & -Sij. *c . cofuafits of contains 

76 by clicking on cf>-»p=.'y a; i . 
In addition, the user can change the order of the URLs stored in container 76 by dragging and dropping different
URLs at different locations therein. As noted above, since the URLs are processed in the order that they appear in
container 76, this feature permits a user to rearrange the order of the documents in container 76 interactively.
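
Because processing order follows container order, the container behaves like an ordered list in which a drag-and-drop is a move from one index to another. The sketch below illustrates that idea only; the `Container` class, its method names, and the `handler` callback are hypothetical and are not taken from WebFormatter itself.

```python
class Container:
    """Ordered collection of URL addresses awaiting processing."""

    def __init__(self):
        self.urls = []

    def add(self, url):
        """Append a URL; new entries are processed last."""
        self.urls.append(url)

    def move(self, old_index, new_index):
        """Model a drag-and-drop: take the URL at old_index, drop it at new_index."""
        self.urls.insert(new_index, self.urls.pop(old_index))

    def empty(self):
        """Remove every stored URL."""
        self.urls.clear()

    def process(self, handler):
        """Apply handler to each URL in stored order and collect the results."""
        return [handler(url) for url in self.urls]
```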

Print 31, Edii 32 ar^j Ea.^^. 5, uJ^e'^ dciit/aitfr), v:...i-;i'i'/vi.ji- u.')Tat[ar to da\.vnio.=d all drolaat Wob pages defined by 
the URLs stored i;i ita;»iGi' 70. l.s.n i/J j.. -.leiih RT!' -..^ai.':] iior^nc} tiie tormatted Web 

40 pages, and do tf.e baie^iui iaJJo;.. ..li., e^iA-!. ciil c Mil t t iiV:- {.ij,s,. iiii:, m :*iscsibeci ir: g.'eater detail 

below. 

Referring t.sclv"'.j 0{/d.jfe -S;lt;,j m c?.. !'»o^::^/£ ii ii.clu.c- "i-ii: : ..il.j ii a.rUtV:&' listbax 92 and 

"Empty after proceE'j.f i.j' :''l.ou:. /-.j t Xi^r , j Jis;*. .i ui., 5 «** a ..-.:c:< { » .?i:.: a c if-ia. Hiu '.ibtbox has been 
selected. In this fty^iMl, rur- 'j\iir i .i- 'i^ix). uiii .ai;. x Ci. t.fii5. ""^-f. li ULl-^ wi. Uents' lislbox 92, when 
45 selected, instrucbi lii.' crr Hia ii f:.'i«: litV^ -. ^\ J: i {yiT\'r:i.vit: "6 v.r, o.rr ints ir. «l fc.r;nafcted output 
document "Empl) p.ccaoin^ . :!.x»jx :>i. './?it'.f ac.wc^Ci.., ti^.^iLatf ciAj.fi£.tuiiiy to empty con- 

tainer 76 after priiiiif.;^i. 'j = -uv.i.g a OJ>3wtiteru, . v.\:ul.ry Vir a u£i«r ;o c>d to. 

Also shown «: fLtii-t c« Zhj.i. Jr.:i' • ""'j> "^'s . : i / 2:Cuc; I'lut:^'.! 'jJ. ^--a-.r..--! mWo^ iO ard OK button 71. 

By clicking on Se^od R'T' i;:i:ix:.* hu:t: JL, r uic. i J.i. s.<u:*:i iii* .'i*F a'-t-ir. iKii ^ilf.-; of ivhich iare .noted above. 
so This can t)e done. ::f iiitimpit 1 ii.i./hihr K .i? .s:, p.cjLli.i.i.J ^(T-' SsIdj.g (nucthowsV; and select- 

ing one of the p.'ede;;MUc j !!:i/Cui :n ' C i..^ i. Ji Corrtiir;':! iplicnii lill ard OK buitcn 71 confirms 

selected options in CcuciU't^' Oj. tjoi ic c; ui ,t. o* li i : . . i.o <:: v 

As shown in Fic^u.z 9 J. cj;fif,i«2i.i; .sc itE.. ^ ..'..vij . 'jct^rf^;,. r.i ic:!. >\ t; . i ^jV sauc :con 39. help but- 
ton 100, done buiton i J1 and minimising ioin ' A us^*^ mdV sele^;: any or tht:ise i.ieiiures by clicking thereon using a 
55 mouse. 

Print icon 96 prints a Web page or a set of Web pages
formatted by WebFormatter. Edit icon 97 opens a formatted Web page for editing by a predetermined
RTF editor. The save icon saves a formatted
Web page as an RTF file. Help button 100 provides information for operating WebFormatter and Done
button 101 exits from WebFormatter. The minimizing icon enters the minimized mode of WebFormatter which was
mentioned above and which is described in greater detail below.

FigureSCshovvs mc-;js^fcvd£d * : iV^lirOT.idtar'J^ .r-r/:.rp€ iirn.T- ss-.-; -mIu^q hla irtem ^03. edtX 

s menu 104 and windo^vrrc'^.u 106 *: r r ^"' i' ' fcf . ~' ^. . . 'j..ct'o .s of which are 

identicaltothoseof Sav*5rxn?e Ift v O'' .liP.in ! . r*r» isrwv; vU'.. ,ii:r i?^ al^f*,:n>videj to exit 

from File menu 103. Fin:!^, Pla nr-' v * ^ U - -• ^ns T^r ' ,1 'I't-" Tj^tio. • c :;'c\.id3S a oser with the 

capability to opan a be a MTI^iL^ir: i r... : fypti;rns;^;afilt '►st'^ tor ^r3t£;;'c» .crpi.-j as a file saved from 
NetScape, or URL files cjk by •lin': "rg a.nzi .'tCT-'^^inir i ;RL :/»:o tie /.ir dcsi mp "Open HTML file" option 
10 107 also provides hoo^s njeHsd \c aptn ' i.^ i cn^.^ 1 3/ r ; : * Vie-: Jii:-:irocfi^3 !:i«;:!.:uti s:i ♦/tat thosa files can be 
formatted as RTF files ivnci print*:;. ia/srJ -.r.cy'cr ct^.ec* l:..:?^ V^.-jt. '■^•rratie;. 

Mt menu 104 pro/ides URL" c. jli^r. . :;.i.i ' c^."'r! ICC o<'*.os tr»e K^Yxnfe c4 a paste buffer, 
such as a URL address copiec* tir -ii a \V::0 pc.9t, ?.«v..^ •> a? d^3scribsi t tc7(a. 

Window menu 1O6 prc\<z^ii a " ^ s-.. ' (Jptio.'r : f re ii is^ -iai // U\ i ♦Ic^rraU'i.x .eyaioiny ttie use. main- 

/5 tenance and backgrouvics c/ Vb^; ...m&.U-3» o.mJ an . vju. . ^ ^ .Mi£,v.v cpt. .. .^l:** . i.ivu^ciu. a u^nr vvitn a dialog 
box (not shown) oonu^if^nci A'etPjrma.ie. s tcfoiUiT .lutri'.-:;. and ocp/';g ' i r.otiva; >). \ /bcfo?^ irsnu 106 also includes 
"Preferences" option 1 \ U. Pr 2feierw.i£' ojtic i . lO 't* i.irioG,'. Jw.cg bc^ . ;:..^4.h r. f .gu»a SC. 

Preferences dia!.:^j 'z^x \iS u-w» ijcii.'...):.;-. .j re .;t » ..^rx .u: w . = ..!;ji.v. ir. r- l^ur-e 9D. prefer- 
erKes dialog box 1 12 ;.!j"u:22arv1:/i m-^s ..p.-.-Ji t C, • r! 1: ^\ . yii:jNiji tcj:^ options 115. 

2C Minimize view options i \ 3 i^r. l>e r,.A iz c . r ' : l j ^".'c : .1 , :t. . 'b , j^i il, .r »:;..:. w . ^ l .--.uii.f i: i-^ii ..'jode. Two 
sets of options are p!Gvi-.»€»d. Tn-^ in ui i . ^cL V*i ' "t^.vi". T; ..^3- .;pt.. n jj. ,c. : ^r>poi:d t) ►vrirrt icon 96. 

edit icon 97 and save icj:, K\ :iiic\vri i\ i.,jfa cU. < c.i',Ci«^:'i..»{A.?.ppa*j ij, i'i /isijc; '<is<tt': one of it-jsse options, 
the Icon far that opfo^ i;* cisjJjiya • i:-^ .• u. Irrui^C ntZ-i . r 1(1^: tjcl. \\ ict^r. ir. J.u* ihe i-aiVa icon. More 

than one option can ins 3e!?x*i2ii at . : xKr F:* le ^il .?;.Mr<i*i. g4:*iv:«J usur in.tiiEace 116. which tsarep- 

25 resentative example cf x cri-ptnic:.' ..."Sd i:>:ei aci* ti. ■ V*? Ji" . . .^j. v^rij.t Wo/^' V/;? uuv^f 1. ir. CA-mmz^ mode. 

Referring backtc R'jjjr^ i;D. Ali:iirn.i:u li : :)! u ' '^Ji u ..aL»>:-s T.^ni * ^ 'ir: options. These options can 

be set to display WelvafindrLogrupJuCU' ..ju ir.t,. d:^i\ t*:.r . r. i^^.. ;.iu;tjr.; v.a"L: ./ 1 ilecung 'Row" or ver- 
tically by selecting *'Sic.:A". iD^'Jy uwi :/! ..rvi^-c lvj *^ ^./w t.^.:. .tvfr c. v trj ni ^. i Ci'vj.: j. itis i:;i*u&oing. graph- 
ical user interface ^ 13 Jorr<355p:»'Jo to a oi 

50 WWWBrowser io ua*) opic it 1 ;5i:c':j.-L.>. v ; l.c h ViJu ;/Sii':-.0'vrKt iil. » . .^ju ./llJi A'ebFoiiTiatter. As 
shown, preferal>ly Ngu".<..-^.-s, ir;it..ii2. L^.a/L^ £/ ./....-....ii: j „ ^. . ^uitr browser 

options can also b€ p{ wV..:;>J. ..-^hy c\. ui d Ln.-i. Tns default 

browser option is N3tSca:;c f itS . it^x - :. 

General optionjs It - ;.icluci 7 uc» r/.4it jfoiv&'&» ,j^t.». ]\7, jpc 1 «. i!..v.v c w^:ti»>f'. 113, "Warn 

35 before printing mori:! tha: y.^Qj^: :^u::u *tO aiJ .Vr;i»):f.i':r*ro.f.-ns '>:r;.tir. .;::c":r:iucn I^C. "Auto-start 

with browser" option lITs-ii, V5L.=::ir ilJji iou;. ; vais:' sul >maLCr.!iy vhi^r.L /pil/L'..;v.co. c cClivUsJ. If ti'iis option 
is not sheeted (which it; ih-i u^'ow/: , Vj\. .1-c -./ a':.! • t.^ -yui- i: cs-jlui! : rll^^'^jML; j'^ a './v i.r.:;n*aite! icoti in the win- 
dowing environment, i>e!fci/.i.x \ . /r..i ...i //i.ii* ..U: '.7;. ..a'. ..U; . ^Vj.jj, c ^ ' j '^r^^^^iin^ a JrlL yrom the 
Web browser into tim ViUC:rc.nxt..: k'j::. »fo , ,\: € = a:,.;;i : • Tit^t.-^'iscl view" option 118, 

40 when selected, opeijy Y\ o,;.v.i:(:! :.! r /: i:.u : :» . . -^i. mode. "Warn 

before printing more ifvj'. i-c.^::*/ rjp:.c • : .a • r uj.'-s ;i <-.i_„ iv'-ici J ; Icwa user 

to control the number of .la-jtiSLv:.'. o*£'; iu" X'ai:?...nt 21. c'lliiisr-jountn-f .-r =m:r , uj*jc(* j&shJ b> tliose pages, 
respectively. Thedefe'.u^ hrh: j: :: os .* w;iL/ii c a., l . ... * • ..j nii,. ; , n: a-] • • • x; j * ^;..t :r the gen- 
eral options can be fS'iie v r :i at the- sanr;t ; ri ; 

45 Preferences dial:,;; i>u> 1 1Ci a^o- i<..,Lv.^.-. . . tl bj*...i U \ i.\ u .:«^.c .5. .-.sc. w ^ >i rilar di !ces and OK 
button 122 which cofifi:!Ki3 a Lim':. tiWiUO.Jj ,;)}c,i«j'.irio«s. 

As explained abo\^, VVab'^crrii.fUv tt; ccMTf 'jjrcc i . a. ^i, ci.ra..t^;f tr^j :1j ; \\\ .i '3j viu Preferences 
dialog box 112. or a uui; ».xr ^^nuv :r.r " nis^a^^ n.jJo v;j/ii ■ Izi.^Q i^c > K.^ f.ho.'jn :• r\{\i x 02. A*j also noted 
above. Figure 9E sfi^ivs ijr i ux%U o': :! . i lical uitr iKt . 1 16 '/Vi.br-:4 r r. J '.S.'.::? ';jud rr.oJie. Gnaphical 

so user interface 116 is d:r,:.*iVot* » ''-^i'" -j) '.Jx:fiiro : i " . if i-.r^ \V.. ' Mii, a uic^ vieivs a Web 
page, the user also viivis v/j c'::>I"::^ I ..i^m i .vc Tat^t ' 't ^ . ' :\'\ 1 .4:^, *> 1 • »..'. ^.,v>' '-^^^i" interface 
116 (which, in Figure ^l., I ^l^Aii^. ; .^vica! r: > l. l.. .... •. \.. . :lv iIilwu ji&i:: .:i:al j£;er inter- 

face 43). the user car. ccpa: 'a lh»i I. Tc I ^iopa^ '>u it . v:a» " 7 'i i. anu save, edit 

and/or print the RTF fii J.. <\Jtei ./ !•«■ i.su i. i..-. . - J^j",*.- -rs ..../roi, ..K^LViiC. ^le icons. 

55 A user can reconrjif/u *i V.*t Tc . lal . - 1 .1 .! .• tW'r ui ^ \ j J * d Jwi : ^ • cii ; a - : u : . J . i n :c b L utU'n. fiis action 
causes a praferencsb ;y >,.<vjcjjj. '..crcob, ..ivV :. : :vrcb t- ku.^;' « Tncifiafter. t^le 

user can alter the ccu'ilg^:: ..o.\ '/Vi.bF ::/! l I: hi at: ae:;ircc 'W.ou h user v^i; .r. /:ue: i f .»»iy-;i;. .o'Joriul nrode from 
the minimizing moda-. ih, . :.tcJ iMCu ^ ^ . :^ .niJ..... '. ; it. I "J J .c ..si : ; / .^^^ t iT:. 
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Figure 10 is a flow diagram describing the operation of WebFormatter. WebFormatter is activated in step S1000. As described above, this can be done by double-clicking on WebFormatter's icon in a windowing environment. Depending upon how WebFormatter has been configured, i.e., in the fully-functional mode or the minimizing mode, either a graphical user interface similar to that of graphical user interface 41 or one similar to that of graphical user interface 116 is displayed in step S1000. Preferably, a graphical user interface similar to that of graphical user interface 41 is displayed in step S1000, since the default of WebFormatter is the fully-functional mode.

Next, in step S1001, WebFormatter is configured as described above via preferences dialog box 112 and options dialog box 62. This step is not necessary unless a user wishes to change WebFormatter's previously set configuration.

In step S1002, document format data is input in fields 44 and 46 to 49 described above. More specifically, the user inputs a URL (or filename in alternative embodiments) in URL field 44. As described below, WebFormatter uses this information to process Web pages located at the URL or listed in the file based on the configuration of WebFormatter and the data input in fields 46 to 49.

In step SlOOo. d Who re£:d?,r i,^- I'j Ihui ui A'tX '£iC(^r M de^wi tbecJ auO/e is eAcCUi€;Cj. The vVeb reader con- 
15 nects to a network '3**^-' a^ >t) iuj Wk.;, ii . ^ . C ^-^ 

Next in step 813C'.>. it r, tioianvi;.v o .vliytii.r i; U. ... .r <x .i.arian.ii »ias u?....-; ^•..t:.,gJ. As JeiCiibeti above, in pre- 
ferred embodiments of '.wabi- cjiviaCiii, ciiy a L M . 1 a; l.-.; uueito. huwevei. nve d iif!rnc*ii«cr eirtbodirnerits of Web- 
Formatter may permit entry uv ci. u'ter<at i<!. a d(:;aoiit.<bt.i : ^1 Httv^s.'ini(.4 a iiic; ct tai :iti Oi.a ot a JHl. acdress will be 
provided. 

20 IfaURLhasbeenentefeoirfied'Vkptcjcessir.gpivVjysdstoste^: S1GD6. ir r.Uvi £1005, the Web reader accesses 
the hypermedia ooc^.rianc {^.g., a ixv iepisg:!) spGrir.iei: by I'vi UP;L iidd-si^i. ' -i 31007, WabFormatter instructs 
the W^ reader Id i'Hvirso Ihi :'!y,:.iji..)a».:ia clcrCJin<i.';t. '» ^GJtic,rtej, \\:bla*:&'..i\' Mi^c^. iddiii}SS(&5) from the 
Web and stores the addresses in container 76. Once all desired addresses have been selected and a processing function, such as print, has been activated, WebFormatter downloads data stored at the addresses in container 76 into memory 5. WebFormatter then extracts predetermined data from the downloaded data based on the configuration information set in Options dialog box 62, and stores the extracted data in memory 5. Thus, for example, if "Text Only" option 72 in Options window 62 is set, only text is extracted from the downloaded data. Processing then proceeds to step S1011.

Onthealhef rariC* if, in r» iCO,;.c.fiionL.nc- !uMn •riivjLii:..v»,ujh:aic;6 itefciu. ^VfjbroinuiUijrlu'-itructstheWeb 
so reader to access a firsi sim iii ms His,, i*. jli^ps b' OCh u.iri 3 : Ii. die iito i;avyi'a .jnd (..Ji,a io exlrdcled and stored 
inthe same manner oS In ftey Xihi'I, dfimutxJ uWj^'^. "\u.., »• c:u; i:iiO .u, .'ivi^ . v^ai 9j oviorniin&s if more sites 
are listed in the HTML source fiie. f/ nor«i Siler* are us e:i r Vfc file. vlovi» rsLn • i>':ep SIOOS. and the next site is 
accessed, if no more sitejs ar t' pr^seMi, proccssirg ■'jrcv.?t;Ji Ic sl^p !?":0n, 

In step S101 1 , V»i5oFonuilter pioce.jc^>: i.t. c;;i.*hs:;rj a^.,i i.i acurj:j;....;e .^'il m.: pr'c iioi.-oly sei format informa- 
55 tion. For example, if Cc'.unns, vii^li! <tO *ii ta rr... * jVi-i. £^»o iL.«t.ij'-iJ.1 i,u a tft'!(^ rj': ^oi?. -niv -i cocurno'X having mul- 
tiple columns. TTie abci'LproovitiSiiiii .r ^.-iatsc (:♦> r,cii/ .upl,.: ht .*» X. I.m icjr' L' *cr otvi- io^n 99. and is 
similar to the procosalr :) driiicribid abi ,u '(ho J i*!: ■ c'.' 3. . . , . .'aaiii g li 1 1 Joci-.r.ci ft urd foririatting the doc- 
ument based on the j'nta:l:/s' .rj. .nL: ci. Acco.uirg::,. ,i UwX i=:i CiUct.. ipl: . r. :•■ :s jt ii'.uc vcr fia sake of brevity. 
Once the docuin'~. /tr> tv^^Sci JBL'S ii e 'ii^rod in rMte i;-r he \ <\ ij-^sA dcvir.iioc.d^G, f'jriTiattad ficcording to the 
40 preset formats c.{Xi tv>/ii!JUi'ai.a*^3, cu.c xnvar:cU ir.l:, , ^'f jirivip S 'rr. ins^i^P Sj'': i lae T-iTi- fil9{s) are out- 
put Alternatively, the RTF fiiei;f:=j Cc»n jo edite.: or Sc.i*.d. Gis,f.Hindin<j upon Thich ic 'i en tn? graphicaJ user interlace 
has been aclivatod. 

The invention has been described with respect to a particular illustrative embodiment. It is to be understood that the invention is not limited to the above-described embodiment and that various changes and modifications may be made by those of ordinary skill in the art without departing from the spirit or scope of the appended claims
as defined in the .ivjpeaiioci oiM r.u. 



so 
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The Kser frcfllt is ii.i;:.H:.,.:.-.i .^d ir vJirrc^i:- 'ni ^.iLci 
fonnat. 



Count«4 



Heading«Nev/s In Briar 
Snotion-Trort Page 

HaxKByi es5-2 C 0 C» 

Teiopiate=l 



r*:v^;Qrr: Filter*:" Footbal l* A":'J' "49er3" 



f *P:rt i.e.* 3 i.j T i ::c sr. 

J.re:.Mror^Fi.I'':er==^"Coin:put«:r" OR "hardwarc-i:" CP. 

:?r:ir^:" ' 1 

h j^iidinn- Sri Lanka 
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Sec t i on»=i;port:s 

Ma.'j:P3Cfes--=10 
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^%W"Clay of tiie we<s2c 
^%s-6ectioii part;: UKiL 

[D^ifaultE;] 
Cciin1:~3 

Uj 

Titie^san Jose Mercury Wews 
Use en azae!^iawicila*aic 

St ar tDa tci«StartHe ad I ines 

HoiKB Pag€!«Fht.tp ? / /ics* w w * s j mercury . coir / 

Si3Ct i onUKJCfSfcittp s / / ^jww . s j:ciircuty • com/ %S • htm 

Soc t i or*CC'nnt=^9 

Sectioji ;>«= International 

r;.i^cii5on ::«Nat:ionai 

JiOC^^ion <..=•• Xx^cal & State 

Sect5.on £i'«Editorials Coimnentary 

r.e t ion to^Busii j rise; 

S-nct i on '^'-•J^-n.t'ertaininent 

r»TP.tlc.^*^?.l'--<T\?/t:l 
".ocal S: f!t.iitc!r--lo :. 
S'?.lto.rla:-i3 COOTi SA^t3::*^/«edit 

Snr/rts-'Spb:? 
Living -l:.v 

'?Hlfi--.Th 5 San "PranciHir^o Clironicle 
Hoit;e Paqe-http :: / /^^nrv . s f gate • com/ chronicle/ 
r5r»;r,*-. \n'iTjl»^~^"ht+'p : . si'gate . coin/<:ig- 
bln /c^T ^n^ vjle / article- 
V-^. r/:. cig i?1 /chronicle/toiiiay" 

;T-vrtion 'l'=?.d5.toricil 
J".-/: ion li^-0cit3boo':. 



Sports«spcr.i;^i = '-^P 
Editoria3.'=Pldi4;3rie.lj 13^1 



[3] 

Tit).e=The Jny N^i'us 
Home pagei- 

ab/c:i. J.iTiio^:?/ . html " 
SecticnCcii:*;: 
Section i«--LVr.a.i./-L:.s 
Secticu 2*''l5ciiO,.ivi2*.l 
Section 2 ^ 
Sectioi; 4^^'Ci:t:i.*5.i 
Section 5"=L^t.tcx3 
Section e^lrBrie! 
Section 7«=sHotS9bm» 
Sect:Lon S tvKu:; 
S ec t :u ::r. Hi 1 i;'r^.:r>* 
Sect : :ioTi i C «■ 'c .1 it. lau 
Section lH'-^-oIrilM.T.arr.ci: 
Sectic?i 



[3 .Sfjctioiiu:^ 

Bus ir A» li B^'L L . b i i\ uiVi-" / :..r t r o 

PoreigasKf oraiyi.;/ :iratro 
LetteirG^ltstterpj/: final 
InBr ic>:;=ir±-i:* i i:.:!! / .Intro 
KcU^^ui'fa-L ..'^'-.icACL ; is^ tr a 
PrcL pt o V ? .'fj / au^:y;o 
Mil:, 1:^1: LtLxy / iiutru 

Obituu:: icL • cJ: -vV •u^./rai; ii.tou 
Spor t:fc;===spcr u s / ijvtro 
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SYNTAX FOR TRAVERSAL, EXTRACTION AND PRINTING



Maximum levels to search: MaxLevels=<#>
    -1: to retrieve all levels
    0-n: to retrieve up to n levels

Maximum pages of the document: MaxPages=<#>
    n: final document not more than n pages

Maximum size of the document: MaxKBytes=<#>
    n: document size not more than n kilobytes

Exclusion rules:

Date=today | lessthan <#>
    today: retrieve only articles posted today
    lessthan <#>:n: retrieve only articles no more than n days old

Retrieve=all | nosubdir | nothisdir | thissiteonly
    all: allow to fetch pages from other sites
    nosubdir: exclude URLs to subdirectories
    nothisdir: exclude URLs in this directory
    thissiteonly: fetch pages from this site only

Keyword search:

KeywordFilter=<keyword> {AND | OR | NOT} <keyword>:
    accumulate only pages containing the combination of keywords

KeywordRank=<#>:n: use fuzzy logic to rank pages according to keyword combination in KeywordFilter and keep top n ranked pages

KeywordAuthor=<author>: accumulate only pages authored by "author"

ExcludeType=ads | nonEnglish
    ads: excludes advertisements
    nonEnglish: exclude articles that are not in English

Flattening rules: Print=all | leaves | level=<#>
    all: includes all nodes in the tree in the linear document
    leaves: include all leaves in the tree in the linear document
    level=<#>:n: include up to nth level of the tree in the linear document

Formatting rules: Template=<#>
    n: print according to default or user template
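
The rule syntax above can be illustrated with a short parsing sketch. This is a minimal illustration in Python and not the disclosed implementation: the names ExtractionRules, parse_rules and page_matches are hypothetical, only a subset of the keys is handled, and the left-to-right evaluation of AND, OR and NOT (with NOT read as "and not") is an assumption, since the syntax does not state a precedence.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

RETRIEVE_CHOICES = {"all", "nosubdir", "nothisdir", "thissiteonly"}


@dataclass
class ExtractionRules:
    """Hypothetical container for a subset of the rule keys listed above."""
    max_levels: int = -1                 # -1 means retrieve all levels
    max_pages: Optional[int] = None
    max_kbytes: Optional[int] = None
    retrieve: str = "all"
    keyword_terms: List[str] = field(default_factory=list)
    keyword_ops: List[str] = field(default_factory=list)
    print_rule: str = "all"              # all | leaves | level=<#>
    template: Optional[int] = None


def parse_rules(text: str) -> ExtractionRules:
    """Parse Key=Value lines; unknown keys and blank lines are ignored."""
    rules = ExtractionRules()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue
        # Split on the first '=' only, so "Print=level=2" keeps "level=2".
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "MaxLevels":
            rules.max_levels = int(value)
        elif key == "MaxPages":
            rules.max_pages = int(value)
        elif key == "MaxKBytes":
            rules.max_kbytes = int(value)
        elif key == "Retrieve":
            if value not in RETRIEVE_CHOICES:
                raise ValueError("unknown Retrieve value: " + value)
            rules.retrieve = value
        elif key == "KeywordFilter":
            # Quoted keywords, with the operators found between them.
            rules.keyword_terms = re.findall(r'"([^"]*)"', value)
            rules.keyword_ops = re.findall(r'"\s*(AND|OR|NOT)\s*"', value)
        elif key == "Print":
            rules.print_rule = value
        elif key == "Template":
            rules.template = int(value)
    return rules


def page_matches(rules: ExtractionRules, page_text: str) -> bool:
    """Evaluate the keyword filter left to right (assumed semantics)."""
    if not rules.keyword_terms:
        return True
    text = page_text.lower()
    hits = [term.lower() in text for term in rules.keyword_terms]
    result = hits[0]
    for op, hit in zip(rules.keyword_ops, hits[1:]):
        if op == "AND":
            result = result and hit
        elif op == "OR":
            result = result or hit
        else:  # NOT is treated as "and not" in this sketch
            result = result and not hit
    return result
```

A profile entry such as `KeywordFilter="Football" AND "49ers"` would then keep only pages in which both keywords occur.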
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DE6CPIPTI0K OP MODULES 
Appendix 3A 



THE PBRSOMl^L liTI^i^ PnOL^TSiS 



The Profile Editor manages access to the user profiles and is represented by CProfileMgr class. It also manages loading and saving of the profiles. The services provided by Profile Editor are:

BOOL NewProfile(CString fileName);

    Creates a new profile given the file name.

BOOL OpenProfile();

    Opens the default profile.

BOOL OpenProfile(CString fileName);

    Opens the named profile.
CProf J liH Entry* t --M rfst Entry Q; 

Lead a' and* rotiirnc the ne?ct 

P'-c .'::l%- ei* .3; » 

Loads oP.d r^s'curns •che iitixt 
profile ^ntxf. 
BOOL Wr. i'lZsEutry (Ci vofileEntry^ entry); 

- Saves a riew emrry :ln tr-e profile. 

Each profile entry contains an extraction specification and an output specification and is represented by CProfileEntry class. The methods provided are:

CUPL Gett it-o.V;?('i; 

po^,>j-v-».o ^.h« «i.it^ Id contained in 

tne prc^:ile i=:ntr'/. 
CExtract ic nsp'vc f^.- 1- p^'/rtraccionSpacO ; 

Retv:.rri3 '^iu^ k?x'*:.r:ac"l:io\i 

prcfil<3 tiintry. Extraction 
spocif i::ation contains keywords 
fcr r^t^^'mtir^j , M^rat? for levels. 

Returns -he oa'.put specification 
oontair:^'^ xr. the prrofile entry* 
0\\*^,x/: f •. '"v*"ication contains 
I'-'-^jro.t*"! ••r/T-yr-.tinj <H rd tree 



1«> 



THE WEB READER MODULE



CWebPage class abstracts interface to the Internet browser and is representative of the actual Web page. It will be responsible for fetching a Web page, extracting links or references to other URLs in the Web page, and maintaining the contents of a Web page. The methods provided are:

BOOL loaxlO? 

'J^,tot 1:':ta "'^eb page using the URL, 
ujjs^rn ',r.Ci and password. 
IS 30o:» Pcrr -'^f 

Parser ths data in 'che Web page 
and C'2 3at45r; e. litst of links. 

roi:ilvet: '-hij rfale/h:l-;o URLs 

so CUi;iLX*ir=>t« CetLiulveQ; 

i<;eturjTi^ -dcie i.Ls'c oi: links in the 

Ci»age;.5ata^* Get^c^aO * 

i^eturi'r.? tha ^crtuai text data 
^ oantai/ted in che Web page, 

void Fil te:: Cont'HntO 

'V^xtva rcs t .tX«$ and oth«ir 
j.rifor:ratioa according to the site 

estrone? tie' I : 

'*'.B'':v'V3:^n t-'Me ?,:;id ot"b.t*r 
inJio'^^-ition aocor'iing to tl\e site 
data. 

^ rst'ri ig Ge-- A'u'diT, -f); 

'•ioru:! 1'. tiuthor of the Web 

naqe , 
int GatSia^O; 

:'i:2tur:i=. vi^-e size of thii data in 
40 <Ll.:> 
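
The link-extraction responsibility described for CWebPage (parsing the page data, creating a list of links and resolving relative URLs) can be sketched as follows. This is only an illustration using the Python standard library; LinkCollector and parse_links are hypothetical names, the example.com addresses are placeholders, and the sketch does not reproduce the OLE browser interface of the appendix.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href targets of anchor tags as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative references against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def parse_links(base_url, html):
    """Return the list of links found in html, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

For a page at `http://example.com/news/index.htm`, a relative link `sports.htm` would resolve to `http://example.com/news/sports.htm`.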



30 



CKetwork clas^^ a i ..\.ii-.a OLE fun :'" i :)nality 

and provides coiiiun m f;/:li Iritn rnsit: 

browsar- 

•JSCirl K/ Gs. !-\^.?.0/ 
Ji^:^i v'.nc^ 'ihts cu£.*rently riet 

void 53:tTJc':a:'at^.f: (LPCToTt^) : 

-vi.v'rRnt c=:ern3":e the 
tw -/rjecit. 
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CStr ing GetPasswt Q ; 

'Dat€cn'i;'.n('* ci'*rj:r;ntly set 

void setPasswordi;:LJ?criiTR} ? 

- Sch the cart^iHt pasisword iB ttee 

void Closed 

Olscomic:^c ^rny s.ctJ.ve conmectiion 
and rere^c '^.'i^-.i ^'JIS at;i? ork object:. 

short Read { BS'.rR*] i u v: . ' ' . r ! >r itoowint ^ ; 

Read d*iJ:f» m /3d by ta& 

long Get3tati2i?:0; 

Qrer" l.hf ir:.nt'^if3 nf the ciirrent 

BOOJ. CpGnCJ.PCrHTJ. ) r;ri, ishortiMeth^^d, 

LPCTSTR pPo?itD<;te, Icrg ::Pcs^Dat:n.^icc, 
I.>?CTSTR pPo:ti:^p^.f^^rc^ : 

.I;ii'i;iat.ei?5 'Mn-e s-etrieval of «i UFL 

CStr iBig Get.Fa:rir¥p^riJ?!-'f = () ; 

;u:t*^rn< V error 

>-'^ir7rtec' loy the .isierver* 

tl'C t vitrcaiVw ^total 
c:.::'.Li-/;. 1». ve-ij) of the current 

CString GetCoiitfiUi 

R4iLuxik vljii l.,H*.' t.neoclintj of the 

CString C4et}i>wplrt \>Q .} 

Hetuiii the latai retrieved by 

Uiiii Ic Lu no longer- considered 

CString Rei. <• » o : L Ci'ivy/it ^-E- :» ^ iiPc:i?OTii 
pR:: at/A e) • 

BOCL IsFini CjAQ 

Dolc^riii:. r- J.;* c- is ooraplet®. 

short Bytesr.cJudyJ^^ , 

- Jii/Oi'in ot the nmitoer 
oi: irivtw ; to £)e read. 



THE SITE DRIVER MODULE

The Site Driver will provide the site information to the Web Reader. The Site Driver is functionally similar to the profile editor and is represented by CSiteDriver class. Services provided are:

BOO.,* i^*4iyI?.ro<.i^3 Cou ing f iieKiune) ; 

Ot.^.iat^£ a j av profile given the 

BOOL npniTil^rc ^ 

e>:fin, f?.G nefaulw prof lis. 
BOOL ore:'iPr7i\il ' >::.s*-::irt/ f ;.lti?laipe) ; 

C t.he rair.ifd profile* 
CiJitePr o1!i le^ 'IJ =!tFirat.?iteO ; 

'l -^.:d- :-L .-o-rrit! th^= fi::i•^:: site 

20 '>j*=tii3 ^n- i'etv>r>.7P5 tJxi Ltext site 

BOOaj .iic-ah'-i :r^^ ^Coai i:?iOf i.'ie^i en'cry) ; 

- r>r.-^'es 3 ViBi. Bncry in tae profile. 

int W?X(!ihe):Otsxte-:0? 
25 I'iietur't^ fiiuailaer of sites 

6iO. An -ne profile. 



An entry i;; 'dau. j.It j pr ...'li;;. contain 
3p information aj.out t'^- of caa i^it-:, Uitla of 

the news cou:..'J:?-j i:.v, ^.-.i^/'i .^lor^ut hoiAf tc -i.r.Liess the 
site, and 'm;-r ' c::-.':- ;:r v ^f ' :'TA.:.Lvon suci as saction 
data atc« an-i -^/'i.ll b'- j^^ v.e -Km". W c ;\t ViiTtry 
class. Me*::lic>d£t prov*vd'?*.d =3.ri?5t 

CStrx:acj Ttir.,:, ; 

.^.'i^:ur^ the base XJRL of the site. 
CScring Getn.:;^: .* m',.; 

1 ^-.-i' UKi'.rna.n'--: for the 

- K*icut I -J the oHSsword for the 

-fi . 

C3 t'l' * ; ? i. r. i. riO ; 

- vtt2--\:; n'S'jivor-l far the 

!> t ' " i v. / 7^ I '.^X ■ " 'I \ r 

tJ. :ri .i r.-'.i .litle of the news 



so 



int :5ec':iorf' :ou^ t }i 
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A pp&ndiK 3B 

TREE MANAGER MODULE



Tree Manager will maintain the most central data structure in this program, which is a tree of Web page nodes and is represented by the CPageTree. CPageTree will traverse the WWW to retrieve the necessary Web pages according to the extraction specification and build the tree. The methods provided are:

CPageTreeNode* GetRoot();

    Returns the root node of the tree.

BOOL Build(CURL url, CExtractionSpec* spec);

    Builds the tree according to the personal news profile extraction specification.


Each node in the page tree is represented by a CPageTreeNode. Methods provided are:

BOOL AddChild(CWebPage* page);

    Adds a child node with Web page data.

CWebPage* GetPage();

    Returns the Web page contained in the node.

int NumberOfChildren();

    Returns the number of children belonging to the node.

BOOL IsLeaf();

    Returns TRUE if a leaf node, i.e., no children.



To traverse the Web page tree, a CTreeIterator class is defined with different traversal methods. Methods provided are:

void ::^.er::c:0; 

and i*/::;. • aij.:*.:r.c^ state data, 
CPageTreeNode* GetNextNode();

    Returns next node in the tree in a depth first search.



CPageTreeNode* GetNextSibling();

    Returns the next node in the tree in a breadth first search.

    Returns the next leaf in the tree in a depth first search.
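
The page tree and its traversal orders can be sketched briefly. The following is a minimal Python illustration and not the CPageTree code itself: PageTreeNode, depth_first, breadth_first and leaves are hypothetical names, and pages are stood in for by plain strings.

```python
from collections import deque


class PageTreeNode:
    """One node of the page tree; 'page' stands in for the Web page data."""

    def __init__(self, page):
        self.page = page
        self.children = []

    def add_child(self, page):
        node = PageTreeNode(page)
        self.children.append(node)
        return node

    def number_of_children(self):
        return len(self.children)

    def is_leaf(self):
        return not self.children


def depth_first(root):
    """Yield nodes in depth-first (pre-order) order."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        # Push children reversed so the first child is visited first.
        stack.extend(reversed(node.children))


def breadth_first(root):
    """Yield nodes level by level."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        queue.extend(node.children)


def leaves(root):
    """Yield only leaf nodes, in depth-first order."""
    return (node for node in depth_first(root) if node.is_leaf())
```

Generators are used here in place of an iterator object with explicit state, which is a design choice of the sketch rather than something stated in the appendix.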



Input to this module will be the Web page tree created by the Tree Manager and the output specification contained in the user profile. Formatter will traverse the tree according to the rules specified in the output specification and the final document will be formatted using the formatting instructions in the output specification and the formatting tags embedded in the Web pages such as headings, paragraphs, emphasized text, etc.



The output document will be in Rich Text Format (RTF) and will be accessible by many applications. RTF is an advanced formatting language for text, providing document, section and paragraph formatting, style sheets, headers and footers, and with support for pictures. Picture formats supported are DIB, DDB, OS/2 metafiles. There is no support for Web page images in GIF format. A third party library will have to be purchased in order to do the conversion of the GIF to DIB format or one can be developed in-house.



The prototype creates an HTML file as the output.



The formatter is represented by the CFormatter class.

BOOL OpenHTMLFile(CString fileName);

    Opens the named HTML file for output.

void CloseHTMLFile();

    Closes and saves the HTML file.

BOOL PrintHTML(CPageTree* root, COutputSpec* format);

    Given the root and the output specification, traverses the tree and prints the contents in the Web pages in HTML format.

BOOL OpenRTFFile(CString fileName);

    Opens the named RTF file for output.

void CloseRTFFile();

    Closes and saves the RTF file.

BOOL PrintRTF(CPageTree* root, COutputSpec* format);

    Given the root and the output specification, traverses the tree and prints the contents in the Web pages in RTF format.

BOOL Print(CPageTree* root, COutputSpec* format);

    Given the root and the output specification, traverses the tree and prints the contents in the Web pages to the default printer.
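
The traversal that the formatter performs before printing can be sketched as a flattening of the page tree under the Print rule (all, leaves, or level=<#>). This is a minimal Python illustration under stated assumptions: flatten is a hypothetical name, the tree is modelled as nested (title, children) tuples, and the root is assumed to sit at level 0, which the appendix does not specify.

```python
def flatten(tree, print_rule="all"):
    """Flatten a (title, children) tree into a linear list of titles.

    print_rule is one of "all", "leaves" or "level=<n>". Treating the
    root as level 0 is an assumption of this sketch.
    """
    max_level = None
    if print_rule.startswith("level="):
        max_level = int(print_rule.split("=", 1)[1])
    elif print_rule not in ("all", "leaves"):
        raise ValueError("unknown Print rule: " + print_rule)

    linear = []

    def visit(node, level):
        title, children = node
        if max_level is not None and level > max_level:
            return  # deeper than the requested level: skip this subtree
        if print_rule != "leaves" or not children:
            linear.append(title)
        for child in children:
            visit(child, level + 1)

    visit(tree, 0)
    return linear
```

The resulting list is in depth-first document order, so a section page is followed immediately by the articles beneath it.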



20 

Claims 

1 . A method for formattinc^ data from at lead one Kvper vna f: a aoa'merrl. comprisi/ig 1h& staps of: 

an accessing step to ac cess ti^z r\ \ }3i>i o";e h'r:": m 1c CL':*r*rj"!l 

a retrieving st^to rertrievo rirl" ^^cn^ t^'^e hy ir-T'^fM-'f. i-<i'^jrr?'Y ntf en p"t'3rteri l5t-.i tree, wherein the data 
is retrieved based cr ?t:'i!^:ti.:'^ c,' Ir/a- h/cc- t?u . ' '.y; •**?r*: 
aflattening stef> to flatten t}»?5 axtrarl^xi cVi? ft ' . . ! n i n jn eM; irn 
30 a fornrtfittlng st«jp to fornat thr 1*1?= - dc»;jr^ v ' *' i fi ^ri < C'urnent 

2. The method of Claim 1, further comprising the step of printing the formatted document.

3. The method of Claim 1, wherein said hypermedia document is located on the World Wide Web.

4. The method of Claim 1, wherein said hypermedia document is located on the Internet.

5. The method of Claim 1, wherein said hypermedia document is located on an intranet.

40 6. The method of Gai n 1. v/i:r2jn i ^^y>iv% (• |) o' io ' ; rt'i;'^ :m! ('r:» er;nrj :i8p. and said tbrmatling 
step are performed ir. ace r:., nr. • . :;i l :i- ^ »jr ' * * 

7. A method of creating a p iaTD ^al • n iv s : 'i' :^ w <• , r ■ ; . ' ' , . » t ' ^ j: -r Ti-Ltjd conrpuler network, com- 
prising the steps of: 

45 

accessing the hypermedia-linked computer network;
entering a learning mode;

traversing sites on the hypermedia-linked computer network according to user commands;
extracting at least one rule from the user commands;
compiling the at least one rule into the personal-news-profile.

8. Vne method of Claim 7. iv^ierain t>i? r'^ Yk&t c-: a r. j :.*.•* ^r. K^rt'C j^:. chs.rsr.fe' Btic$» of sites for traversing the 
hypermedia -linked t^C'nipc: t- ' i Dtv . ' . 

55 9. The method of Claim 8. wJiej jln j-o at is :3f c^. l . ipc; *:<.:•; :-lc.r. bailed chter a for traversing the hyper- 
media-finked CQrr:,iu:Q: naU^c. \ 

10. Apersonalizationsystemfororeati'.gais'i.)!,. ...etc i.j M-. 1. :.'<V:bw;.dr{i:xv6( .:at£i (itrisval system, the per- 
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sonalization system onprislng: 

an input device for inputting data and commands to access the World Wide Web;
a connection to the World Wide Web;

a memory for storing a Web reader, the Web reader for accessing the World Wide Web via the connection to the World Wide Web according to commands from the personalization system; and
a processor for launching the personalization system in response to a user command, wherein the personalization system upon being launched (1) launches the Web reader, (2) accesses the World Wide Web via the Web reader, (3) enters a learning mode, (4) sends commands to the Web reader to traverse the World Wide Web according to user commands, (5) extracts at least one rule from the user commands, (6) compiles the at least one rule into a personalization profile, and (7) stores the personalization profile.

11. A method for retrieving articles from a hypermedia-linked computer network and for formatting the articles into a personalized newspaper, the method comprising the steps of:

retrieving a stored personal-news-profile containing an address for a site on the hypermedia-linked computer network, command data for accessing data from the site, and newspaper layout commands;
contacting the site based on the address in the personal-news-profile;
downloading articles from the contacted site based on the command data in the personal-news-profile;
flattening the articles into a linear document; and

formatting the linear document into the personalized newspaper according to layout commands stored in the personal-news-profile.

12. The method of Claim 11, further comprising the step of printing the personalized newspaper.

13. The method of Claim 11, wherein the hypermedia-linked computer network is the World Wide Web.

14. The method of Claim 11, wherein said hypermedia-linked computer network is on the Internet.

15. The method of Claim 11, wherein said hypermedia-linked computer network is on an Intranet.

16. The method of Claim 11, wherein the command data for accessing data includes data for selecting articles based on a structure of the site.

17. The method of Claim 16, wherein the command data for accessing data also includes data for selecting articles based on a content of the articles.

18. A World Wide Web site data retie*/al syst« m tj sn:. '-..is-na at \ux\\ orre Wnb s'te. for retrieving data from the Web 

site, and for forna:wit^ ihc: di:*K^ ivl; ; r-; jyL-:iK\i jjl p^sirg. 

40 

an input device br .npUling dale a.*'0 oDr/r.\n' * i i I: -li. c-j,: 'i tic 'jVivld Wide iVei^; 

a memory fcr sluing a Vt^bt) :&::ti daia ;ebft'v'.ii ivu* j nich ^icuci^s «i Wub leacer, stored Web site address 
inforrnatipri. 4,!uied VJsb -siliH ..r. an j...! s.^:. L.vna: i»fe*;riCi.r.an, ^t!»3r£in memory also includes 
process steps to uoiMsecc .i: i VVio Sti--. l ki • j rr;. i u.cii s-X^'a: Vi& : :.nnec1ed Web site; 

45 a conni2( tlon [o it-j Vi-jfl:! vl?:':, 

a proceecc; icr :i._x;hir.cj li-vt. Ak .i S'U; ^In^;; '2:i-e^;ai driver i»i Hifiponse to a user inputting a command to 
access ths vVorkI Wide Wsb. \mo Ji^ .t)3 JiLc .*j|r.cf-iJ diiver, upcn oe^-ig launclied, ( I) launches the Web 
reader to t-orr lujt Ic- ihe Vju^ic' V V/sb ; ^ . J C/jiirticA- j, JC-:) rfttr u . as. Web sitts address informatton 
and Wob sri3 wOm;/«3:nd&, ^3] i.'M»£rLCis M'.e mV&:* i^ *a<i3i' tti access the \^^eb site b«u>ed on the Web site address 

so informatio/i si tc- Wob site :^.'jiJTO.u:In. 'A] r:i: . ' ; :.S,. ■ , :v otuh fi». .n t <2 Wtt oite based on the Web site 

commai^ds, um/sin tie dt'JX /• -JLVrrliJiidja ,1; ': i . L a utikco ii^i as to avoid hypermedia-links that 

form Icops Li:i i/) to a J:. ...j U:"'^ '.J -l, i u _ /lu l.. •uJ, i:e&n downloaded. (5) stores 

the WebSite data ui a iineai cia.:uiiariL. (Ji r vssts i^l^ps 1 through 5 uniil asi addresses in the stored Web site 
address a itc;.:ia'iup !:av-j 0;;t> 1 ' <ccst lu, u m j !'niu; into tfie personalized docunr^ent 

55 based on iiiij ':o r nJ infc * r 

19. The Web s{t<5 ji=:;a l.ib.rJ s^-il*.! jf C- ' * r.:*.^ P h :.Hi bxiiire- ^ i.Tc- rnaiion. the Web site com- 
mands, and ihi: j il: . f n3. . . i ..1 . u-. r , . y 0 -.is. ^'.riiC. ;^ i i-:. s- 1: .c^ile. 
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20. The Web site data retrieval system of Claim 18, further comprising a printer for printing the personalized document.



21. The Web site data retrieval system of Claim 18, wherein the personalized document represents a personalized newspaper.

22. The Web site data retrieval system of Claim 18, wherein the personalized document represents a personalized magazine.

23. The Web site data retrieval system of Claim 18, wherein the personalized document represents a personalized book.

24. Computer executable process steps stored on a computer-readable medium, said steps for accessing World Wide Web sites, for retrieving data at the sites and for formatting the data into a personalized document, said steps comprising:

a connecting step to connect to the World Wide Web;

a retrieving step to retrieve user-defined Web site address information, user-defined Web site commands, and user-defined formatting commands;

an activating step to activate a Web reader to access a Web site based on the user-defined Web site address information and retrieving data from within the Web site based on the user-defined Web site commands;

a downloading step to download retrieved data from the accessed Web site into an extracted data tree;

a flattening step to flatten the extracted data tree into a linear document;
a step to repeat the downloading step and the flattening step until all addresses in the user-defined Web site address information have been accessed; and

a formatting step to format the stored data into the personalized document based on the user-defined formatting commands.

25. The computer executable process steps of Claim 24, further comprising a spooling step to spool the personalized document to an output device.

26. The computer executable process steps of Claim 25, wherein the output device is a printer.

27. The computer executable process steps of Claim 25, wherein the output device is a display.

28. The computer executable process steps of Claim 24, wherein the user-defined Web site commands include commands for selecting data based on a structure of the Web site.

29. The computer executable process steps of Claim 28, wherein the user-defined Web site commands also include commands for selecting data based on a content of the Web site.

30. An apparatus for retrieving news articles from Web sites on the World Wide Web and formatting the news articles into a personalized newspaper, the apparatus comprising:

first storage means for storing a personal-news-profile which includes (1) address data and command data for accessing data from a Web site, and (2) newspaper format commands;

retrieval means for retrieving [illegible] therein;

activating means for activating a Web reader to contact a Web site based on address data stored in the personal-news-profile;

downloading means for downloading [illegible] a connected Web site based on command data stored in the personal-news-profile;

second storage means for storing the downloaded news articles; and

formatting means for formatting the stored news articles into the personalized newspaper based on the newspaper format commands stored in the personal-news-profile.

31. The apparatus of Claim 30, further comprising spooling means for spooling the personalized newspaper to a printer.
32. A method for formatting data from a hypermedia document into a personalized document, comprising the steps of:

a location specifying step to specify a location of the hypermedia document;

a type specifying step to specify the type of the hypermedia document;

a scope specifying step to specify the scope of data to be retrieved from the hypermedia document, wherein the scope is based on a structure of the hypermedia document;

a format specifying step to specify a format for formatting the data retrieved from the hypermedia document into the personalized document;

an accessing step to access the hypermedia document found at the location specified in the location specifying step;

a retrieving step to retrieve data from the hypermedia document accessed in the accessing step, wherein the data is retrieved in accordance with the type specified in the type specifying step and in accordance with the scope specified in the scope specifying step;

a formatting step to format the data retrieved in the retrieving step into the personalized document, wherein the data is formatted in accordance with the format specified in the format specifying step.

33. The method of Claim 32, further comprising [illegible] personalized document.

34. The method of Claim [illegible], wherein the location specified in the location specifying step is a filename.

35. The method of Claim [illegible], wherein the location specified in the location specifying step is a uniform resource locator for the World Wide Web.

36. A method of processing a hypermedia document comprising the steps of:

accessing the hypermedia document;

extracting addresses from the hypermedia document;

storing the addresses extracted from the hypermedia document in a container in a memory;

activating a processing function to process data stored at addresses stored in the container;

downloading the data stored at addresses in the container into the memory;

extracting predetermined data from downloaded data in accordance with predetermined configuration information;

formatting the predetermined data in accordance with predefined formatting settings to generate a formatted document;

processing the formatted document in accordance with the processing function.
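
The sequence of steps recited above (extract addresses, store them in a container, download, extract, format, process) can be sketched in Python as follows. This is a hedged illustration only: treating the hypermedia document as HTML and the addresses as `href` values of anchor tags is an assumption, and `download`, `extract`, `fmt` and `process` are hypothetical callbacks standing in for the configuration information, formatting settings and processing function.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values of <a> tags into an ordered container."""

    def __init__(self):
        super().__init__()
        self.container = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.container.append(value)


def extract_addresses(html_text):
    """Return addresses in the order they appear in the document."""
    parser = LinkExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.container


def process_document(html_text, download, extract, fmt, process):
    """Run the pipeline over every address held in the container."""
    container = extract_addresses(html_text)
    downloaded = [download(address) for address in container]
    selected = [extract(data) for data in downloaded]
    formatted = fmt(selected)
    return process(formatted)
```

Passing the stages in as callbacks keeps the sketch free of network access; a real processing function might print or display the formatted document.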

37. A method according to Claim 36, further comprising a step of previewing the formatted document prior to processing the formatted document.

38. A method according to Claim 37, further comprising the steps of:

changing the formatting settings after [illegible] processing the formatted document in accordance with the processing function;

re-activating the processing function; and

re-formatting the data in accordance with the changed formatting settings to generate the formatted document.

39. A method according to Claim [illegible], wherein the addresses are stored in the container in the order that the addresses are input into the container; and

wherein the processing function processes the predetermined data in the order that the addresses are stored in the container.

40. A method according to Claim [illegible], further comprising the step of rearranging the addresses stored in the container by dragging and dropping [illegible].
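
Rearranging the addresses in the container amounts to moving one entry of an ordered list to a new position. A minimal Python sketch follows, assuming the container is a plain list and that a drag-and-drop gesture has already been translated into a source index and a destination index; the function name `move_address` is hypothetical.

```python
def move_address(container, src, dst):
    """Return a copy of `container` with the item at `src` moved to `dst`.

    Processing the returned list front to back then follows the new order.
    """
    reordered = list(container)
    item = reordered.pop(src)
    reordered.insert(dst, item)
    return reordered
```

Returning a new list rather than mutating in place is a design choice of this sketch; it leaves the original order available should the rearrangement be cancelled.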

41. A method according to Claim [illegible], further comprising the step of setting the formatting settings and configuration information via a graphical user interface.

42. A method according to Claim [illegible], wherein the graphical user interface comprises plural processing icons, one of which activates the processing function.

43. A method according to Claim 42, wherein the graphical user interface [illegible] plural modes.

44. A method according to Claim 43, wherein the plural modes comprise (1) a fully-functional mode in which the graphical user interface [illegible] and processing icons, and (2) a minimizing mode in which the graphical user interface displays only the processing icons.

45. A method according to Claim 44, wherein the graphical user interface displayed in the minimizing mode is displayed during browsing the hypermedia document.

46. An apparatus for processing a hypermedia document, comprising:

a Web reader which accesses the hypermedia document;

means for extracting addresses from the hypermedia document;

a memory including a container [illegible] document;

a graphical user interface having processing icons which activate at least one processing function to process data stored at [illegible] in the container; and

processing means which (1) downloads [illegible] container into the memory, (2) extracts predetermined data [illegible] in accordance with predefined configuration settings, (3) formats the predetermined data in accordance with predefined formatting settings to generate a formatted document, and (4) processes the formatted document in accordance with the processing function.

47. An apparatus according to Claim 46, further comprising previewing means for previewing the formatted document prior to processing the formatted document.

48. An apparatus according to Claim [illegible], wherein the addresses are stored in the container in the order that the addresses are input into the container; and

wherein the processing function processes the predetermined data in the order that the addresses are stored in the container.

49. An apparatus according to Claim [illegible], further comprising [illegible] means for dragging and dropping the addresses in the container [illegible].

50. An apparatus according to Claim 46, further comprising setting means for setting the formatting settings and configuration information via a graphical user interface.

51. An apparatus according to Claim [illegible], wherein the graphical user interface comprises plural processing icons, one